(Plates I-IV.)

Introductory Note. 

The people with whom these papers will deal are those officially
called “Anglo-Indians” in India. They are not, however, the
Anglo-Indian of English literature and common parlance, in which the
term is applied to persons of English, or rather British, birth who
have spent a considerable part of their lives in India. Some years ago
the Government of India, seeking to avoid the associations that had
grown up round the name Eurasian, decided that persons of mixed Indian
and European blood should be known henceforth as Anglo-Indians.¹ The
word Eurasian had itself been invented to avoid a coarser and more
descriptive term. That even the more recent designation was inaccurate
in point of fact was pointed out at the time of its introduction in a
letter published in a Calcutta newspaper and signed “Franco-Burman.”
The term Indian, indeed, had been stretched to include all native
denizens of the Indian Empire (Burmese, Baluchis, etc., as well as
Indians properly so-called), while it had been forgotten that any
other European nation but the English had ever had a part in India.

The observations on which Professor Mahalanobis’ analyses are based
had their origin as follows. Ever since I began to take a serious
interest in anthropometry, I have had doubts as to the value of bodily
measurements taken on the living person. So long ago as 1903,² I
pointed out that my own measurements of the faces of the people of the
Faroe Islands were completely at variance with those of a previous
observer, and attributed the different results mainly to slight
differences in technique. The working out of the measurements of the
various tribes of the Malay Peninsula obtained in 1901-1902³ by Mr.
H. C. Robinson and myself increased my doubts, and further made me
suspicious


1 I understand, however, that as early as 1830 the term Anglo-Indians had 
already been applied to persons of mixed descent. 

2 Annandale, Proc. Roy. Soc. Edinburgh XXV, pp. 2-24 (1903).

3 Annandale and Robinson, Fasciculi Malayenses, Anthropology
(1903-1904).
that there was some inherent fallacy in the whole method. These
measurements were taken with special care, each individual being
measured three times over and most by two observers. Although
they showed the gross differences in head-measurements between
the civilized and the uncivilized tribes, they failed completely to
demonstrate differences between the heads of the Negrito and of
the Indonesian jungle tribes.

Having in 1916 an opportunity of examining a number of 
Anglo-Indians anthropometrically, I determined to see whether 
my doubts were further justified by the investigation of a race 
known to be of recent mixed origin. Before discussing the methods 
adopted, I must say a few words about my subjects. They were,
with very few exceptions, young men between the ages of 18 and
40, and with few exceptions belonged to what I may call the 
middle class of so-called Anglo-Indians, mostly employed as clerks, 
mechanical engineers, overseers and so forth, or else fresh from 
school and about to take up employment of the kind. The fact 
is of importance, for social distinctions are somewhat rigidly 
maintained in this community. I am indebted to Mr. H. A. 
Stark, late Principal, Dacca Training College, now Principal 
Armenian College, Calcutta, for valuable information on the po 
Among the Anglo-Indian community of Calcutta some families claim
descent from Mahommedan ladies of noble and even princely birth, who
in the old days entered into alliances of a perfectly regular kind
from a Mahommedan point of view with Englishmen of good birth. These
families are, however, comparatively few. At the other end of the
social scale are the “Kintalis”,¹ whose origin is thus described by
Mr. Stark in a lecture on “Calcutta in Slavery Days” read before the
Calcutta Social Study Society on March 13th, 1916.

“The liberated slaves [who, as Mr. Stark had previously explained,
were mainly Indians but included not a few Negros] unbeknown to
themselves that they had been doing what the Manumitted Roman slaves
had done centuries before, in gratitude assumed the surnames of their
late masters. Their descendants, for the most part, survive in the
‘Kintal’ population of the city.”

If this were a full statement of the case, it might be doubted 
whether the Kintalis have any real claim to be of mixed race, 
unless there is some slight admixture of Negro blood ; but, as in 
all cities, there is a tendency for certain individuals of the more 
respectable classes to sink down to the slums and become a part 
of the submerged population, which is represented in Calcutta, 
so far as the Christian communities are concerned, by the Kintalis. 

Be this as it may, few or no Kintalis are among the persons 
I measured, and probably none of very old family. So far as 
possible, moreover, we have eliminated from the measurements 

1 The name is derived from the lodging-houses (Kintal) in which many of
these people live or lived. The word Kintal, however, now means little more
than a slum inhabited by low-class Christians.
analysed those of persons known to have recent Negro or Mongoloid
blood, i.e. persons one of whose parents or grandparents was a Negro
or belonged to a Mongoloid stock. This has been a necessary
precaution, because the number of individuals in which the further
complexity was introduced was large enough to affect the results
without being sufficiently numerous to afford a sound basis for
mathematical treatment. So far as recent Negro blood was concerned I
was fairly confident in accepting the statements of those who offered
themselves for measurement, as certain, not by any means all, Negro
traits were present. I refer particularly to woolly hair, dark
complexion, negroid nose and prognathism. The long lower limb and
slender shin of the Negro, which is of a different type from that of
the Indian, were not perpetuated in a single individual.¹ As to old
Negro blood, no definite information was obtained.

To eliminate the recent Mongoloid element from our investigations
was, however, a much less easy task and I am by no means sure that
this has been done successfully. Here again I had to trust to the
statements of individuals measured, but Mongoloid traits are often
reproduced in a much more subtle manner than Negroid, and the
Mongoloid element in the population of Calcutta is much larger than
the Negroid. Indeed, I have observed that many of the most intelligent
Anglo-Indians with whom I have had dealings have had distinctly
Mongoloid features. This is not surprising, for the offspring of women
of the various Mongoloid tribes of the Himalayas, Assam and Burma, who
are not generally averse to unions of a more or less permanent nature
with educated Europeans settled in their districts, are not only of
respectable parentage in both lines but often receive a good
education, and Calcutta is the natural goal of such people. So far as
I could discover, it is unusual for an Anglo-Indian to know much of
his family for more than two or three generations back and at the
present time, in Calcutta at any rate, most of the community are the
result of marriages of persons of mixed blood.²

The subjects of my investigations were, therefore, mainly of mixed
Indo-European blood, probably in many individuals with some Mongoloid
admixture, but not affiliated with the higher Hindu castes.

The measurements were taken in the zoological laboratory of the
Indian Museum in the years 1916-1919. I had the help of


1 As only about half a dozen Anglo-Indian-Negros were examined, I have 
refrained from giving details and merely cite the results for what they are worth. 
Recent Negro settlers in Calcutta are mostly West Indians. They and their 
families occupy a street practically by themselves. 

2 I may here note that further complexity is now being introduced into the
Anglo-Indian community by the marriage of Anglo-Indian women to Canton
Chinese, who are now numerous as cabinet-makers and bootmakers in Calcutta.
These men keep themselves entirely apart from the Indian communities and
frequently marry Anglo-Indians, though the custom of bringing their wives from
China is becoming much more common than it was a few years ago.
several assistants, among whom I may mention in particular my late
laboratory assistant Mr. J. Gaunter, to whom I was indebted for
obtaining many of my subjects. Dr. F. H. Gravely and Dr. K. S. Roy
devoted much time and labour to helping me. The investigations were
conducted in a less systematic manner than I would have wished, partly
because they were in themselves of the nature of an experiment and I
was perpetually attempting to discover more satisfactory methods, and
partly because they had to be carried out at odd times, chiefly on
Sundays and holidays, when subjects were available. The measurements
that have been utilised by Prof. Mahalanobis were, however, made on
one system and with the same instruments. The system was that
recommended in the British Association’s hand-book on anthropology and
the instruments were the “Anthropometer” (112) and
“Instrumentantascher” (203) supplied by Hermann of Zurich.

Prof. Mahalanobis has, in my opinion wisely, decided to treat the
measurements as accurate only within 2 mm. He notes a tendency on my
part to favour even numbers. Of this I was barely conscious at the
time, but on attempting to reconstruct the process in my mind I seem
to recollect that when I was not quite sure of a measurement within a
millimetre, I had a prejudice in favour of even numbers. I never
thought it possible to measure to within less than a millimetre. It is
curious, however, that this prejudice seems to have communicated
itself to my assistants, by several of whom the measurements were
occasionally taken while I noted them down. That it has done so is
evidence at any rate of uniformity of method.

The measurements, discussed without knowledge of mathematics, seemed
to me so unsatisfactory that I had practically decided to reject them
altogether, until I was so fortunate as to get into touch with Prof.
Mahalanobis at the Nagpur meeting of the Indian Science Congress and
he offered to analyse them statistically. The results he has already
obtained seem to justify their publication, and to emphasize the value
of co-operation and co-ordination of different branches of scientific
work in anthropology, without which, in my opinion, further progress
in most branches of biology has become impossible.

The special importance of investigations conducted on the
Anglo-Indians lies in the fact that although we may not be able to
trace out the history of any one family, we know that the whole race,
if such it may be called, has arisen practically within the last 200
years by the admixture of other pre-existing races. After Prof.
Mahalanobis has discussed my measurements on mathematical lines, I
hope to have an opportunity of considering other aspects of the
somatology of this interesting community. We hope thus to throw some
light on the question of the origin of human races by fusion.

N. Annandale,

Director, Zoological Survey of India,
Calcutta.
SECTION I. GENERAL REMARKS.


In the present paper I have attempted a statistical examination of
Anglo-Indian Stature based on Dr. Annandale’s records. The
measurements were all taken by Dr. Annandale or in a few cases under
his direct supervision. Thus the present material may be considered
free from large fluctuating errors due to different personal bias of
different observers.


Nature of the Material. 


Dr. Annandale has explained in his introductory note the special
character of the present material. After excluding “Negro,” “West
Indies,” “Chinese,” “Burmese” and “Bhutia” ancestry and omitting
certain incomplete and doubtful records a series of 200 was obtained
for Stature, Head Length, Head Breadth, Nasal Length, Nasal Breadth,
Zygomatic Breadth and Upper Face Length.¹

The great importance of the present material from a biometrical
standpoint will be easily appreciated. So far as I am aware this is
the first time that a true biologically mixed population is being
studied by statistical methods.

From the statistical standpoint the coefficient of variability is
considered to be a very important test of homogeneity.² Hitherto all
attempts to fix the upper limit of homogeneous variability were
necessarily confined to the study of artificially made up mixtures.³
The Anglo-Indian data furnish us with a “natural mixture.” A careful
study may be expected to throw considerable light on this vexed
question. Incidentally, it will be of great interest to compare the
variability of such a “mixed” population with those of “purer” races.⁴
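The coefficient of variability referred to above is conventionally the standard deviation expressed as a percentage of the mean, V = 100 σ / M. A minimal sketch of that calculation follows; the function name and the three statures are hypothetical illustrations, not values from Dr. Annandale's records, and the standard deviation is taken with divisor n as an assumption about the convention in use.

```python
from statistics import fmean, pstdev


def coefficient_of_variability(values):
    """Return V = 100 * S.D. / mean, with the S.D. taken about the mean (divisor n)."""
    return 100.0 * pstdev(values) / fmean(values)


# Hypothetical statures in mm, for illustration only (not the paper's data).
sample = [1600.0, 1650.0, 1700.0]
v = coefficient_of_variability(sample)
print(f"V = {v:.3f} per cent")
```

Because V is a pure ratio it can be compared between populations of different average size, which is what makes it convenient as a rough index of homogeneity.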

The Anglo-Indian population may really represent a new race in the
making, and we hope to discuss in the sequel what indications may be
afforded by a study of the present material as regards the mechanism
of race formation.

It should be noted however that the word “race” is here used in its
statistical sense. Pearson⁶ says, “Any race may originally have
arisen from a mixture of races, but such a mixed race is wholly
different from a mixture of races, which have not interbred.”


1 Arithmetical work on these characters is nearly finished and I hope to
publish the results at an early date.

2 It is true of course for uni-modal data only, or more generally for
distributions which cannot be dissected into component frequency groups. For a
fuller discussion of this point see pp. 34, 93-94.

3 Myers, Man, February, 1903, pp. 28-32. Also see Karl Pearson's
discussions on this point in Biometrika Vol. 2, 1903, pp. 345-347, Myers' Reply
and Pearson’s Remarks on this Reply in Biometrika Vol. 2, 1903, pp. 504-508.

4 Purer in a statistical sense, i.e. more homogeneous.

6 Biometrika Vol. 2, 1903, p. 506.
The special significance of the present material is that it does
represent a mixed race which has interbred and whose component
races are still in a pure form.

Plan and Scope of the Paper. 

Dr. Annandale took a very large number of measurements extending to
forty different characters. But the records are not complete in each
case. As I have already mentioned, a series of 200 has been obtained
for seven¹ metric characters. A second group² consists of about 120
to 180 and a third³ of 50 to 100 complete records. In addition eye
and skin colour were recorded, as also observations on hairiness in
all cases.

In the present paper the frequency distribution and variability of
stature have been discussed at some length. Certain points have been
considered in great detail, much of which it will not be necessary to
repeat in subsequent parts.

The second part (material for which is nearly ready) will contain a
study of the frequency distribution and variability of individual
organs included in the first group. Correlation between the organs of
the first group will be next discussed and after that the study of
the second and the third group will be taken up. Finally I hope to
discuss the distribution and correlation of eye, hair and skin-colour
in a separate paper.

I should make my position quite clear; I frankly confess 
that I know very little of anatomy. My work on the data supplied 
has been purely statistical. 

Some of the results may appear to be thoroughly unconventional or
sometimes perhaps even startling in character. With such a short
series, it is of course impossible to lay emphasis on the numerical
value of any particular constant. But I would like to draw the
attention of Anthropologists to statistically significant magnitudes
as not unworthy of careful study. I have contented myself with
pointing out statistical results and have refrained from drawing
Anthropological conclusions.

The chief object of the present study is to invite the attention of
Physical Anthropologists of India to the importance of the application
of accurate statistical methods to their “crude” measurements. As some
of the technical terms may be unfamiliar


1 Stature, Head Length, Head Breadth, Nasal Length, Nasal Breadth,
Zygomatic Breadth, Upper Face Length.

2 (i) Gonial breadth 181. (ii) Frontal breadth 142. (iii) Shoulder breadth
171. (iv) Thigh breadth 171. (v) Height of knee-joint, inside 174. (vi) Height
of knee-joint, outside 120. (vii) Height of middle finger 132. (viii) Styloid
height 167. (ix) Trochanter height 180. (x) Iliac height 175. (xi) Upper
radius height 118. (xii) Suprasternal height 119. (xiii) Acromion height 181.
(xiv) Leg length 174. (xv) Chest, extended 137.

3 (i) Total face length 93. (ii) External orbital breadth 93. (iii) Ocular
breadth 91. (iv) Distance between eyes 87. (v) Chest, depressed 88. (vi)
Kneeling height 87. (vii) Sitting height 93. (viii) Earhole height 87. (ix)
Span of arms 93. (x) Cubit 87. (xi) Hand length 76. (xii) Humerus length 48.
(xiii) Radius length 48. (xiv) Foot length 78. (xv) Foot breadth 78.
to Anthropologists, I have thought it advisable to include short
explanatory notes, which would have been unnecessary in a purely
Biometrical paper.

I must also offer my apologies to the trained statistician. Much of
the work will no doubt appear to him to be quite superfluous. I would
remind him that one of our objects has been to persuade the
Anthropologists to adopt statistical methods. This has necessitated
detailed consideration of certain points which may appear obvious to
a trained statistician.

For example, a very full discussion of the effect of grouping has
been given. All frequency constants were calculated several times
over with very different units of grouping. It is then shown that the
effect of grouping is quite negligible within very wide limits, a
result which is of course quite familiar to all statisticians.¹ But as
I found very wide-spread popular misapprehension regarding this point
I have considered it desirable to give an actual empirical
demonstration of the above fact. The discussion of various
“corrections” for grouping will have its own interest to the
statistician.
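The kind of empirical check described above can be sketched as follows. This is an illustration under stated assumptions, not the paper's own computation: the 200 "statures" are hypothetical values placed at evenly spaced quantiles of an assumed normal curve (mean 1660 mm, S.D. 65 mm), the grouping units of 10, 20 and 40 mm are arbitrary, and the correction applied is Sheppard's (subtracting h²/12 from the grouped variance), which is one of several corrections that might be meant.

```python
from statistics import NormalDist, fmean, pstdev


def grouped_constants(values, width, origin=0.0):
    """Mean and S.D. computed as if every value sat at its class midpoint.

    Returns (mean, crude S.D., S.D. after Sheppard's correction).
    """
    mids = [origin + (int((v - origin) // width) + 0.5) * width for v in values]
    mean = fmean(mids)
    var = fmean([(m - mean) ** 2 for m in mids])
    var_corrected = var - width ** 2 / 12.0  # Sheppard's correction for grouping
    return mean, var ** 0.5, var_corrected ** 0.5


# Hypothetical data: 200 evenly spaced quantiles of an assumed normal curve.
ASSUMED = NormalDist(mu=1660.0, sigma=65.0)
statures = [ASSUMED.inv_cdf((i + 0.5) / 200) for i in range(200)]

raw_mean, raw_sd = fmean(statures), pstdev(statures)
results = {w: grouped_constants(statures, w) for w in (10, 20, 40)}

print(f"ungrouped : mean {raw_mean:8.2f}  S.D. {raw_sd:6.2f}")
for w, (m, sd, sd_corr) in results.items():
    print(f"unit {w:2d} mm: mean {m:8.2f}  S.D. {sd:6.2f}  corrected {sd_corr:6.2f}")
```

Running the sketch with other grouping units or another origin is a quick way to see how far the unit can be coarsened before the constants move by more than their probable errors.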

Another consideration has guided me in this introductory paper. Any
extension of a scientific method to new material requires caution.
Our Anglo-Indian data cannot be assumed to be homogeneous in
character, hence I have thought it desirable to justify empirically
the application of statistical methods to such mixed data as the
present material. The assumption of “normality” (i.e. of
approximately Gaussian distribution) thoroughly permeates many
important statistical methods. It was therefore necessary to
investigate the question of frequency distribution in great detail.

The arithmetical labour has been very great, especially as I did not
have any modern calculating machine to help me. This want of
mechanical accuracy may have introduced some uncertainty in the
arithmetical results and this is why I have quoted the arithmetic very
fully in order to facilitate checking by others. In the case of
important “moments,” I have checked them absolutely by working with
different starting points (i.e. different base numbers).
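The check by different base numbers rests on the fact that moments taken about any arbitrary origin, once transferred to the mean, must give the same central moments. A minimal sketch of that check follows; the ten statures and the two base numbers (1600 and 1700 mm) are hypothetical, and the function name is an illustration rather than anything used in the paper.

```python
import math


def central_moments(values, base):
    """Moments about an arbitrary base number, transferred to the mean.

    Returns (mean, mu2, mu3, mu4).
    """
    n = len(values)
    d = [v - base for v in values]
    # Raw moments nu_1 .. nu_4 about the chosen base number.
    nu1, nu2, nu3, nu4 = (sum(x ** k for x in d) / n for k in (1, 2, 3, 4))
    mean = base + nu1
    mu2 = nu2 - nu1 ** 2
    mu3 = nu3 - 3 * nu1 * nu2 + 2 * nu1 ** 3
    mu4 = nu4 - 4 * nu1 * nu3 + 6 * nu1 ** 2 * nu2 - 3 * nu1 ** 4
    return mean, mu2, mu3, mu4


# Hypothetical statures in mm, for illustration only.
values = [1580.0, 1612.0, 1634.0, 1650.0, 1658.0,
          1666.0, 1690.0, 1704.0, 1722.0, 1770.0]

a = central_moments(values, base=1600.0)
b = central_moments(values, base=1700.0)
for name, x, y in zip(("mean", "mu2", "mu3", "mu4"), a, b):
    print(f"{name:4s}: base 1600 -> {x:.6f}   base 1700 -> {y:.6f}")
```

Any disagreement between the two columns beyond the last retained decimal would point to an arithmetical slip in one of the two workings.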

This is my first venture into the province of Biometry and 
it is not unlikely that I have made mistakes. I have included 
full details of the statistical work in the hope that competent 
Biometricians will kindly help me by pointing out errors. I 
have retained six places of decimals in the arithmetic, not in the
vain hope of reaching an impossible degree of accuracy, but for 
convenience of checking. It is difficult to attain agreement to the 
second place in the final results unless about six figures are 
retained in the intermediate calculations in this type of work. 

1 K. Pearson, “Errors of Judgment &c.” Phil. Trans. Roy. Soc. Vol. 198A
(1902); “Assortative Mating in Man.” Biometrika Vol. 2, 1903, p. 485. The
authors note that “the system of grouping adopted is within wide limits
immaterial.”
I have intentionally made the present analysis very elaborate. A
total of only 200 observations did not perhaps merit such close
scrutiny. As there was no early prospect of increasing this total
considerably, I thought it better to complete even a provisional
investigation thoroughly rather than wait indefinitely for a larger
sample. But the chief reason which prompted me to make an intensive
study of the small available amount of material is this: so far as I
am aware no work in this line has been done in India; no
Anthropologist in India has ever made any use of the modern
statistical calculus associated specially with the name of Karl
Pearson and the Biometric School. The present study is intended to
illustrate the urgent necessity of the application of statistical
methods to Anthropology. The conclusions based on only 200
observations cannot of course claim any degree of finality. But these
serve to show the kind of results which can be reached by statistical
methods and also show the great scope and huge possibilities of
statistical methods.

Remarks on the Application of Statistical Methods. 

Before proceeding to the more systematic part of the work I 
wish to make a few general observations on the application of 
Statistical methods. I cannot do better than begin by quoting 
some remarks of Charles Goring in this connection.¹

"Statistical enquiry, all scientific enquiry, is observational in 
character: that is to say, it is based upon the observation of
individual facts. But these facts, in themselves, do not constitute
knowledge. Knowledge consists in the discovery of relationships
revealed by the systematic study, and by the legitimatised weighing
of facts.”

“No series of biological or social observations constitutes
knowledge in itself. Knowledge lies potential in the facts, but
ineffectual for use until their associations with each other have
been accurately weighed. It is the weighing of observations which
demands for the present enquiry, the employment of statistical
methods: such methods being merely a regulated mechanism by which the
relation between certain order of facts can be precisely determined.”

“There is not, as is sometimes imagined, any special theory or
hypothesis involved in conclusions revealed by statistics. The
science of statistics provides only for the systematised study and
legitimatised interpretation of observed facts: such interpretation
consisting mainly in one and the same process, the associating or
dissociating one set of facts with and from another. Before any
association can be legitimately postulated, certain conditions must
be fulfilled; evidence must be produced to show that the relation,
affirmed to exist, is not a chance or accidental, but a natural
association; that it is not one resulting from coincidence, but that
it represents an inseparable connection between natural phenomena.”

1 Charles Goring, The English Convict, pp. 19-20 (H.M.S.O. 1913).

“The attributes and conditions of living things are so widely
variable, are so delicately graduated in different individuals that
their correlation can seldom be legitimately postulated, and can
never be precisely estimated, without aid from a correlation
calculus: that is to say, social science almost entirely, and
biological and medical sciences to great extent, can only be built up
after preliminary mathematical analysis of large series of carefully
collected data.” This is the reason why we assert that statistical
methods are indispensable for our present enquiry.

We have got Anthropometric measurements of 200 Anglo-Indians as our
material in the present case. We know that this constitutes only a
very small sample of the whole Anglo-Indian population. We wish to
investigate the Anthropometric characteristics of the whole
population but we are constrained to do so from a study of the sample
alone. If the sample exhibits certain typical features, we shall be
justified in inferring the presence of these typical features in the
general population. Thus our first statistical task is to find out
the typical features of our sample. In order to do so, it is
necessary to describe the given sample by means of a suitable typical
curve, that is, to graduate the given sample suitably.

This very process of graduation itself will “smooth out” the
irregularities peculiar to the particular sample considered. Hence
when a typical formula is once obtained we get rid of the special
individual peculiarities of the given sample and can replace the
given sample by our graduated curve in all subsequent discussions.
This graduated curve is, by logical induction, assumed to be typical
of the whole population.
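Graduation in its simplest form can be sketched by laying a normal curve over the class intervals and reading off the frequencies it predicts. The example below is only an illustration: the class limits, the mean of 1660 mm and the S.D. of 65 mm are hypothetical, the function name is invented for the purpose, and the curve actually fitted to the data need not be Gaussian at all.

```python
from statistics import NormalDist


def normal_graduation(boundaries, n, mean, sd):
    """Expected class frequencies under a normal curve.

    `boundaries` are the interior class limits; the two open-ended tail
    classes are included, so the result has len(boundaries) + 1 entries.
    """
    curve = NormalDist(mu=mean, sigma=sd)
    cum = [0.0] + [curve.cdf(b) for b in boundaries] + [1.0]
    return [n * (hi - lo) for lo, hi in zip(cum, cum[1:])]


# Hypothetical class limits (mm) and constants, for illustration only.
boundaries = list(range(1500, 1841, 40))
expected = normal_graduation(boundaries, n=200, mean=1660.0, sd=65.0)

for lo, hi, e in zip([None] + boundaries, boundaries + [None], expected):
    label = f"{lo if lo is not None else '   -'} to {hi if hi is not None else '-'}"
    print(f"{label:>14s}: {e:6.2f}")
```

The graduated frequencies, set beside the observed ones class by class, are what a test of goodness of fit would then compare.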

This typical frequency curve is defined by certain statistical
constants¹ calculated from the measurements actually given in the
sample. The reliability of each constant is determined by the internal
consistency or uniformity of the particular set of measurements from
which it is derived (and the total number of measurements). The
reliability (measured by the probable error) can be precisely
calculated with the help of the statistical calculus based on the
theory of probabilities.
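For the two commonest constants the probable errors have simple large-sample forms under the normal law: 0.67449 σ/√n for the mean and 0.67449 σ/√(2n) for the standard deviation. The sketch below evaluates them for n = 200, the size of the present series; the S.D. of 65 mm is a hypothetical figure chosen only for illustration, and the function names are not taken from the paper.

```python
import math

# Ratio of the probable error to the standard error under the normal law.
PE_FACTOR = 0.67449


def probable_error_of_mean(sd, n):
    """Probable error of the mean of n observations with standard deviation sd."""
    return PE_FACTOR * sd / math.sqrt(n)


def probable_error_of_sd(sd, n):
    """Probable error of the standard deviation (normal-theory approximation)."""
    return PE_FACTOR * sd / math.sqrt(2 * n)


# n = 200 as in the present series; the S.D. of 65 mm is hypothetical.
pe_mean = probable_error_of_mean(65.0, 200)
pe_sd = probable_error_of_sd(65.0, 200)
print(f"P.E. of mean = {pe_mean:.3f} mm, P.E. of S.D. = {pe_sd:.3f} mm")
```

Both quantities shrink only as the square root of the number of observations, which is why a series of 200 leaves the constants uncertain to within a few millimetres.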

Thus in any statistical enquiry the first part of the work consists
in the determining of the appropriate frequency constants and their
probable errors. This is done in section II of the present paper,
which also contains an elaborate technical discussion of the effect
of grouping.

The next part of our work consists in constructing a type which is
assumed to be true for the general population, within the limits of
the probable error of the type. This is the problem discussed in
section IV.


1 I have given a short account of some of these constants in non-technical
language in Appendix I, pp. 90-94.




1922.] P. C. Mahalanobis: Analysis of Stature.

Once the typical curve is built up we can proceed to comparison with
other general populations as represented by their own typical
formulae. Goring observes “no valid comparison between two series of
statistics is possible until the constants of each series have been
determined.”¹

But even then, no conclusion can be safely asserted from the
comparison, until a certain condition has been fulfilled. “Before
drawing conclusions from the comparison of statistics, we must be
certain that we are dealing with strictly random samples of the
same homogeneous material” (italics mine).

This introduces the second part of our work. For valid comparison we
must investigate the homogeneity (or otherwise) of our material. I
have discussed the statistical tests of homogeneity in section III,
and the application of these tests in section V.

We then pass on to the question of comparison with other data. In
section VI, I have considered the nature of the material for
comparison and in the next section (section VII) I have investigated
the question of comparative homogeneity in great detail.

In section VIII, I have added a preliminary note on the 
variation of stature with age. I shall discuss the question of age 
correlation and growth in a later paper. 


1 Cf. Goring, p. 33. “In order that complex groups such as two series of
measurements, may be compared, these have to be reduced to a simple form, to
the genius, as it were, of the series, i.e. certain values, called constants (the
mean, mode, standard deviation, etc.), have to be extracted; and the groups
compared through the medium of their constants. These values, however, are
only themselves comparable in certain conditions. First, we must know that the
statistics they represent are not chaotic in their distribution, that the sequence of
their frequencies have been determined by law. And, secondly, we must know
the range of error to be discounted before any actual differences between the
constants compared may be regarded as significant. Before we can assert that
one series of measurements inherently differs from another, we must predict and
allow for a certain amount of difference or arithmetical inexactness, which,
according to the law of probability, is bound to appear in limited samples of the
same homogeneous material. This predicted amount of insignificant difference
is called, as we have already said, the probable error of the constants under
consideration.”

“Briefly resumed the matter stands thus: we must compare, not this
or that particular measurement, but the whole series of measurements obtained
from a random sample of (one population) with a similar whole series obtained
from a random sample of (another) population. In order to make this comparison
two things will be necessary: we must extract from each series its statistical
constants, the mean, the standard deviation, etc., of the series: and by the
theory of probability, we must determine for each constant obtained, its probable
error. These constants, with their probable errors, will be the representatives
of the series, which, through their medium, become comparable with each other.
If the differences between the results compared are not greater than the probable
errors of these results, such differences may be regarded as insignificant: if the
difference is not greater than twice the probable error, it may be regarded as
probably insignificant; and if it is not greater than three times the probable error,
it may be regarded as possibly insignificant. On the other hand, if any difference
found is greater than three times the probable error, it is reasonable to
assume that the difference is due to some definite influence over and above those
causes which are inherent in the sampling process.”
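Goring's rule of thumb quoted in the note above can be written out as a small decision function. This is a sketch only: the function names are hypothetical, and taking the probable error of a difference as the root of the sum of squares of the two probable errors is an added assumption (appropriate to independent samples), since the quotation does not say how the two errors are to be combined.

```python
import math


def pe_of_difference(pe_a, pe_b):
    """Probable error of a difference of two independent constants (assumed rule)."""
    return math.sqrt(pe_a ** 2 + pe_b ** 2)


def classify_difference(difference, probable_error):
    """Apply the one, two and three probable-error limits to a difference."""
    ratio = abs(difference) / probable_error
    if ratio <= 1.0:
        return "insignificant"
    if ratio <= 2.0:
        return "probably insignificant"
    if ratio <= 3.0:
        return "possibly insignificant"
    return "significant"


# Hypothetical means (mm) and probable errors of two series, for illustration only.
mean_a, pe_a = 1660.0, 3.0
mean_b, pe_b = 1675.0, 4.0
pe_diff = pe_of_difference(pe_a, pe_b)
verdict = classify_difference(mean_a - mean_b, pe_diff)
print(f"difference = {mean_a - mean_b:.1f} mm, P.E. = {pe_diff:.1f} mm: {verdict}")
```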
The raw material, in the form of the actual measurements, has been
included in Appendix II.

“Tables,” throughout the present paper, have reference to the
indispensable volume edited by Karl Pearson, “Tables for
Statisticians and Biometricians” (Cambridge University Press, 1914).


Note on “Bias” in recording measurements. 

It is well known that different observers are affected with
different “personal bias” in taking measurements. In the present case
the crude data showed an overwhelming preponderance of “even” readings
as against “odd” measurements.

In the case of Stature, we find no less than 193 “even” readings as
against only 7 “odd.” We have no reason to believe that Nature has any
special preference for an “even” number of millimetres; hence, apart
from personal bias and fluctuations due to random sampling, we should
have had 100 “even” and 100 “odd” readings. Instead of this, we
actually get 193 and 7.

The presence of “bias” is obvious, but I have calculated the
“Contingency”¹ for the whole group of the above seven measurements.


Table 1.

Contingency for “bias.”

Organ.               Theoretical   Observed   m − m'.   (m − m')²/m
                     value.        value.

Stature                  100          193        93        86.49
Head Length              100          174        74        54.76
Head Breadth             100          181        81        65.61
Nasal Length             100          111        11         1.21
Nasal Breadth            100           93         7         0.49
Zyg. Breadth             100          156        56        31.36
Upper Face Length        100          105         5         0.25

n' = 7                                              χ² = 240.17


The probability that random sampling would lead to as
large or larger a deviation between theory and observation is given by

$$P = e^{-\frac{1}{2}\chi^2}\left\{1 + \frac{\chi^2}{2} + \frac{\chi^4}{2\cdot 4}\right\}$$

$$\log_{10} P = -\tfrac{1}{2}\chi^2 \log_{10} e + \log_{10}\left\{1 + \frac{\chi^2}{2} + \frac{\chi^4}{2\cdot 4}\right\}$$

With $\chi^2 = 240.17$,

$$\log_{10} P = -\frac{240.17}{2}\log_{10} e + \log_{10}\left\{1 + \frac{240.17}{2} + \frac{(240.17)^2}{2\cdot 4}\right\} = \overline{49}.713029$$

Thus $P = 5.17 \times 10^{-49}$, or the chances are $2 \times 10^{48}$ to 1 against
there being no bias.

1 Karl Pearson: Phil. Mag., Vol. L, pp. 157-175, 1900.

In the case of Stature the unit of grouping is greater than 
10 mm. and hence this preponderance of even values of millimetres 
is not a matter of great consequence. 
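
The contingency calculation above can be re-traced numerically. The following is a minimal sketch, assuming the seven observed counts of “even” readings read from Table 1 and Pearson's finite series for P when the number of cells n' is odd; the function name `pearson_p_odd` is ours and purely illustrative.

```python
import math

# Counts of "even" readings out of 200 for the seven measurements (Table 1).
observed = {
    "Stature": 193, "Head Length": 174, "Head Breadth": 181,
    "Nasal Length": 111, "Nasal Breadth": 93,
    "Zyg. Breadth": 156, "Upper Face Length": 105,
}
expected = 100  # theoretical number of "even" readings for each measurement

# Contingency as tabulated: sum of (m - m')^2 / m over the seven rows.
chi2 = sum((obs - expected) ** 2 / expected for obs in observed.values())

def pearson_p_odd(chi2, n_cells):
    """Pearson's series for P when n' is odd:
    P = exp(-chi2/2) * (1 + chi2/2 + chi2^2/(2*4) + ...), with (n'-1)/2 terms."""
    total, term = 0.0, 1.0
    for k in range((n_cells - 1) // 2):
        total += term
        term *= chi2 / (2 * (k + 1))
    return math.exp(-chi2 / 2) * total

p = pearson_p_odd(chi2, 7)
print(f"chi-square = {chi2:.2f}, P = {p:.3e}, log10 P = {math.log10(p):.4f}")
```

P comes out so small that only its order of magnitude matters for the argument; the last printed digits depend on the rounding used in the logarithms.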



SECTION II. EFFECT OF GROUPING ON THE 
FREQUENCY CONSTANTS. 

Frequency constants and probable errors. 

The object of the enquiry contained in this section may be 
best explained in Karl Pearson’s words. 1 

“It is well known that if the distribution of errors follows
the normal law, the “best” method of finding the mean is to
add up all the errors and divide by their number, the “best”
method of finding the square of the standard deviation is to form
the squares of the deviations from the mean and divide by their
number.... These “best” methods become far too laborious
in practice when the deviations run into hundreds or even thousands.
The deviations are then grouped together, each group containing
all deviations falling within a certain small range of quantity,
and the means, standard deviations, and correlations are
deduced from these grouped observations. If the means, standard
deviations, and correlations be calculated from the grouped
frequencies as if these frequencies were actually the frequency of
deviations coinciding with the midpoints of the small ranges
which serve for the basis of the grouping, we do not obtain the
same values as in the cases of the ungrouped observations. It
becomes of some importance what corrective terms ought to be
applied to make the grouped and ungrouped results accord. This
point has been considered by Mr. W. F. Sheppard (who has proposed
certain corrections). Thus corrected the values of the constants
of the distribution as found from the ungrouped and grouped
deviations will nearly, but not of course absolutely, coincide.”

In this section I have calculated both ungrouped and grouped 
constants with widely differing units of grouping. The constants 
as corrected by Sheppard’s formulae have also been calculated in 
each case. By a comparison of the different constants we find 
that within very wide limits the effect of grouping is negligible. 

The Stature list was classified into groups of 50 mm. The
base number is taken to be 1655 mm. and the moment coefficients
were calculated as shown below. 2

We get the following table for “raw” moments about 1655:


1 Karl Pearson: “On the Mathematical Theory of Errors of Judgment and
on the Personal Equation,” Phil. Trans. Roy. Soc., Vol. 198A, 1902, pp. 249,
250.

2 For details, see K. Pearson: “On the Systematic Fitting of Curves, etc.,”
Part I, Biometrika, Vol. I, 1902, pp. 265-303 and Vol. II, 1902, pp. 1-24. Also
W. Palin Elderton: “Frequency Curves and Correlation,” pp. 13-19 (C. and
E. Layton, 1917) and G. Udny Yule: “Theory of Statistics” (Charles Griffin
& Co.).

Group (mm.)   Mid-ordinate    x    y = Frequency     xy    x²y    x³y    x⁴y     x⁵y     x⁶y

1430-1480         1455       -4          3          -12     48   -192    768   -3072   12288
1480-1530         1505       -3          5          -15     45   -135    405   -1215    3645
1530-1580         1555       -2         14          -28     56   -112    224    -448     896
1580-1630         1605       -1         45          -45     45    -45     45     -45      45
1630-1680         1655        0         60            0      0      0      0       0       0
1680-1730         1705        1         48           48     48     48     48      48      48
1730-1780         1755        2         20           40     80    160    320     640    1280
1780-1830         1805        3          3            9     27     81    243     729    2187
1830-1880         1855        4          2            8     32    128    512    2048    8192

Sum of negative terms                              -100           -484           -4780
Sum of positive terms                              +105           +417           +3465

Total                                  200           +5    381    -67   2565   -1315   28581



Dividing by the total, N = 200, we get for the “raw” moments,
S denoting a summation for all groups:

$$\nu_1' = \frac{S(xy)}{N} = +0.025, \qquad \nu_2' = \frac{S(x^2y)}{N} = +1.905,$$

$$\nu_3' = \frac{S(x^3y)}{N} = -0.335, \qquad \nu_4' = \frac{S(x^4y)}{N} = +12.825,$$

$$\nu_5' = \frac{S(x^5y)}{N} = -6.575, \qquad \nu_6' = \frac{S(x^6y)}{N} = +142.905.$$


The true Mean is given by

1655 + (0.025 × 50) = 1656.25 mm.
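
The raw moments and the Mean can be reproduced directly from the grouped frequencies. This is a minimal sketch, assuming the nine 50 mm groups and frequencies of the table above; the helper name `raw_moment` is ours. Exact fractions are used so that nothing is lost to rounding.

```python
from fractions import Fraction

# x: deviation of each 50 mm group from the base 1655 mm, in working units.
# y: frequency of each group (total 200).
x = [-4, -3, -2, -1, 0, 1, 2, 3, 4]
y = [3, 5, 14, 45, 60, 48, 20, 3, 2]
N = sum(y)

def raw_moment(k):
    """k-th raw moment about the base: S(x^k y) / N."""
    return Fraction(sum(xi ** k * yi for xi, yi in zip(x, y)), N)

raw = {k: raw_moment(k) for k in range(1, 7)}

# The Mean in millimetres: base plus first raw moment times the group width.
mean_mm = 1655 + float(raw[1]) * 50

for k, value in raw.items():
    print(f"nu_{k}' = {float(value):+.3f}")
print(f"Mean = {mean_mm:.2f} mm")
```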

Transferring 1 to the true Mean with the help of:

$$\mu_2 = \nu_2' - \nu_1'^2$$

$$\mu_3 = \nu_3' - 3\nu_1'\nu_2' + 2\nu_1'^3$$

$$\mu_4 = \nu_4' - 4\nu_1'\nu_3' + 6\nu_1'^2\nu_2' - 3\nu_1'^4$$

$$\mu_5 = \nu_5' - 5\nu_1'\nu_4' + 10\nu_1'^2\nu_3' - 10\nu_1'^3\nu_2' + 4\nu_1'^5$$

we get moments about Mean (without correction)

$\mu_2 = 1.904375$
$\mu_3 = -0.47784377$
$\mu_4 = 12.86564258$
$\mu_5 = -8.18049398.$

1 Karl Pearson: “Contributions to the Mathematical Theory of Evolution:
On the Dissection of Asymmetrical Frequency-curves,” Phil. Trans. Roy. Soc.,
Vol. 185A (1894), p. 71.


The moments were checked by calculating the “raw”
moments about 143.0 cm. (end of range) as base. The
“raw” moments were

$\nu_1' = 4.525$, $\nu_2' = 22.38$, $\nu_3' = 118.02625$, $\nu_4' = 657.4275$,
$\nu_5' = 3846.6203125$,

but after transferring to the Mean, the same values as before were
obtained.
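
The check described above (two different bases must give the same moments about the Mean) can be sketched as follows. It is a minimal illustration, assuming the group mid-points and frequencies of the stature table; the function names are ours. Recomputed figures may differ from the printed ones in the last decimals.

```python
from fractions import Fraction as F

y = [3, 5, 14, 45, 60, 48, 20, 3, 2]        # frequencies of the nine groups
mids = [1455 + 50 * i for i in range(9)]    # group mid-points in mm

def raw_moments(base_mm, orders=range(1, 6)):
    """Raw moments about base_mm, in working units of 50 mm."""
    n = sum(y)
    x = [F(m - base_mm, 50) for m in mids]
    return [sum(xi ** k * yi for xi, yi in zip(x, y)) / n for k in orders]

def central_from_raw(v1, v2, v3, v4, v5):
    """Transfer raw moments about any base to moments about the Mean."""
    m2 = v2 - v1 ** 2
    m3 = v3 - 3 * v1 * v2 + 2 * v1 ** 3
    m4 = v4 - 4 * v1 * v3 + 6 * v1 ** 2 * v2 - 3 * v1 ** 4
    m5 = v5 - 5 * v1 * v4 + 10 * v1 ** 2 * v3 - 10 * v1 ** 3 * v2 + 4 * v1 ** 5
    return m2, m3, m4, m5

about_1655 = central_from_raw(*raw_moments(1655))
about_1430 = central_from_raw(*raw_moments(1430))

print([float(m) for m in about_1655])
print("bases agree:", about_1655 == about_1430)
```

Because the arithmetic is exact, agreement between the two bases is an equality rather than an approximate match.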

The Standard Deviation 1 (S.D.) is given by $\sigma = \sqrt{\mu_2}$.
Thus $\sigma = \pm 1.38$ in working units
$= \pm 69.00$ mm.

The Coefficient of Variation 2 (V) is defined by $V = 100\sigma/M$, and we get

V = 4.1660.

We must now proceed to find the other frequency constants 3

$$\beta_1 = \frac{\mu_3^2}{\mu_2^3} = 0.033204$$

$$\beta_2 = \frac{\mu_4}{\mu_2^2} = 3.547534$$

Skewness = Sk. = 0.069858

where

$$\text{skewness} = \frac{\sqrt{\beta_1}\,(\beta_2 + 3)}{2(5\beta_2 - 6\beta_1 - 9)}.$$

Distance between Mode and Mean $= d = \sigma \times$ skewness.
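
A short numerical sketch of these constants follows, assuming the uncorrected moments about the Mean quoted above (in working units of 50 mm). The variable names are ours. The recomputed values agree with the printed ones to about three decimals; the later digits depend on how intermediate results were rounded.

```python
import math

# Uncorrected moments about the Mean, in working units of 50 mm.
mu2, mu3, mu4 = 1.904375, -0.47784377, 12.86564258

beta1 = mu3 ** 2 / mu2 ** 3
beta2 = mu4 / mu2 ** 2

# Pearson's skewness (magnitude only, as in the text).
skewness = math.sqrt(beta1) * (beta2 + 3) / (2 * (5 * beta2 - 6 * beta1 - 9))

sigma_mm = math.sqrt(mu2) * 50      # S.D. converted to millimetres
d_mm = sigma_mm * skewness          # distance between Mode and Mean

print(f"beta1 = {beta1:.6f}, beta2 = {beta2:.6f}")
print(f"skewness = {skewness:.6f}, d = {d_mm:.3f} mm")
```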

It is now necessary to find the Probable Errors. 4


1 Also see Appendix I.

2 Karl Pearson: “Regression, Heredity and Panmixia,” Phil. Trans. Roy.
Soc., Vol. 187A (1896), p. 203. See footnote on p. 34.

3 (i) Karl Pearson: “Skew Variation in Homogeneous Material,” Phil.
Trans. Roy. Soc., Vol. 186A (1895), pp. 343-414; Supplement, Vol.
197A (1901), pp. 443-459.

(ii) Karl Pearson: “On the Mathematical Theory of Errors of Judgment,”
Phil. Trans. Roy. Soc., Vol. 198A (1902), pp. 274-279 and p. 277.

(iii) “Skew Frequency Curves,” Biometrika, Vol. 4 (1905), pp. 169-212;
Biometrika, Vol. 5 (1906), pp. 168-171 and pp. 172-175.

(iv) W. Palin Elderton: “Frequency Curves and Correlation” (Charles
and Edwin Layton, London), with Addendum and Errata, 1917.

4 The fundamental memoirs are: (a) Karl Pearson and L. N. G. Filon: “On
the Probable Errors of Frequency Constants and on the Influence of
Random Selection on Variation and Correlation,” Phil. Trans. Roy.
Soc., Vol. 191A (1898), pp. 229-311.

(b) W. F. Sheppard: “On the application of the Theory of Error to cases
of Normal Distribution and Normal Correlation,” Phil. Trans.
Roy. Soc., Vol. 192A (1899), pp. 101-167.

The Probable Error of Mean 1

$$= \frac{0.6744898}{\sqrt{n}}\,\sigma = \chi_1\sigma.$$

Probable Error of Standard Deviation

$$= \frac{0.6744898}{\sqrt{2n}}\,\sigma = \chi_2\sigma.$$

Probable Error of Coefficient of Variation

$$= \frac{0.6744898}{\sqrt{2n}}\,V\left\{1 + 2\left(\frac{V}{100}\right)^2\right\}^{\frac{1}{2}}.$$

We find, Probable Error of Mean = 0.32906 cm.
Probable Error of S.D. = 0.23267 cm.
Probable Error of V = 0.14166.
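
The three probable errors can be evaluated in a few lines. This is a minimal sketch, assuming n = 200, σ = 69.00 mm and M = 1656.25 mm from the preceding pages; the variable names are ours, and results are in millimetres (divide by 10 for centimetres). Recomputed figures may differ from the printed ones in the last decimals.

```python
import math

Q = 0.6744898                    # probable error of a unit normal deviate
n = 200                          # sample size
sigma_mm = 69.00                 # uncorrected S.D. in mm
mean_mm = 1656.25                # Mean in mm
V = 100 * sigma_mm / mean_mm     # coefficient of variation

pe_mean = Q * sigma_mm / math.sqrt(n)
pe_sd = Q * sigma_mm / math.sqrt(2 * n)
pe_v = Q * V / math.sqrt(2 * n) * math.sqrt(1 + 2 * (V / 100) ** 2)

print(f"P.E. of Mean = {pe_mean:.4f} mm")
print(f"P.E. of S.D. = {pe_sd:.4f} mm")
print(f"P.E. of V    = {pe_v:.4f}")
```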


The Probable Error of S.D. requires correction for skewness.
The P.E. of S.D. 2

$$= \frac{0.6744898\,\sigma}{\sqrt{2n}}\left\{1 + \tfrac{1}{2}(\beta_2 - 3)\right\}^{\frac{1}{2}}$$

which reduces to the usual expression involving $\sigma/\sqrt{2n}$ for the normal
curve, since $\beta_2 - 3 = 0$ approximately in this case. Making this
correction we get P.E. of S.D. = 0.2643 cm. This correction has
been made in all subsequent work, but the difference made is not
considerable in any case.

The probable errors of $\beta_1$ and $\beta_2$, skewness and d were found
from Tables XXXVII, XXXVIII, XL and XLI, pp. 68-77 of Tables
for Biometricians. 3


Probable Error of $\beta_1$. Table XXXVII, p. 68.

$\beta_1 = 0.0332$

For $\beta_2 = 3.5$, $\sqrt{N}\,\Sigma_{\beta_1} = 0 + \frac{332}{500}(1.37) = 0.9069$

For $\beta_2 = 3.6$, $\sqrt{N}\,\Sigma_{\beta_1} = 0 + \frac{332}{500}(1.50) = 0.9930$

For $\beta_2 = 3.5475$, $\sqrt{N}\,\Sigma_{\beta_1} = 0.9069 + \frac{475}{1000}(0.0861) = 0.9473$

Multiplying by $\chi_1 = 0.67449/\sqrt{n} = 0.04769$
we get, P.E. of $\beta_1 = 0.045201$.

Then from Table XXXVIII, p. 71.

$\beta_1 = 0.0332$

For $\beta_2 = 3.5$, $\sqrt{N}\,\Sigma_{\beta_2} = 10.85 + \frac{332}{500}(0.9) = 11.4458$

For $\beta_2 = 3.6$, $\sqrt{N}\,\Sigma_{\beta_2} = 12.67 + \frac{332}{500}(1.07) = 13.3783$

For $\beta_2 = 3.5474$, $\sqrt{N}\,\Sigma_{\beta_2} = 11.4458 + \frac{475}{1000}(1.9325) = 12.3637$

hence P.E. of $\beta_2 = 0.589625$.

From Table XLI, p. 76.

For $\beta_2 = 3.5$, $\sqrt{N}\,\Sigma_{sk} = 1.31 - \frac{332}{500}(0.002) = 1.3087$

For $\beta_2 = 3.6$, $\sqrt{N}\,\Sigma_{sk} = 1.32 - \frac{332}{500}(0.002) = 1.3187$

For $\beta_2 = 3.5475$, $\sqrt{N}\,\Sigma_{sk} = 1.3087 + \frac{475}{1000}(0.01) = 1.3134$

P.E. of Skewness = 0.062636


(c) “On the Probable Errors of Frequency Constants,” Biometrika, Vol. 2
(1903), pp. 272.

(d) Karl Pearson: “On the Mathematical Theory of Errors of Judgment,”
Phil. Trans. Roy. Soc., Vol. 198A (1902), pp. 274-279.

(e) “Probable Errors of Frequency Constants,” Part II, Biometrika, Vol. 9
(1913).

1 Tables were published by W. Gibson and Raymond Pearl (Biometrika,
pp. 585—393) to facilitate the calculation of probable errors. These have been
now reprinted as Tables V and VI in “Tables for Statisticians and Biometricians”
(Cambridge University Press, 1914).

2 Karl Pearson, Editorial Note on a paper by Raymond Pearl: “On Certain
Points concerning the Probable Error of the Standard Deviation,” Biometrika,
Vol. 6 (1909), p. 117.

3 These tables were originally published by A. Rhind in Biometrika, Vol. 7
(1910), pp. 126-147 and pp. 386-397. Rhind gives an excellent summary of the
whole subject.



We thus find

Mean, M = 1656.25 ± 3.2906 mm.

S.D., σ = 69.00 ± 2.6431 mm.

Coeff. of V, V = 4.1660 ± 0.1407


The other constants are:

$\beta_1 = 0.033204 \pm 0.045201$
$\beta_2 = 3.547534 \pm 0.589625$
Skewness = sk = 0.069858 ± 0.062636

We thus find that the skewness is not significant. Hence we 
are justified in assuming normal distribution, at least to a first 
approximation. 

On this assumption we can find the P.E. of the moments
quite easily.

The S.D. of any moment $\mu_q$ in a sample of size n is given by 1

$$n\,\sigma^2_{\mu_q} = \mu_{2q} - \mu_q^2 + q^2\mu_2\,\mu_{q-1}^2 - 2q\,\mu_{q-1}\,\mu_{q+1}.$$

P.E. of $\mu_2 = 0.67449\sqrt{2/n}\;\mu_2 = 0.067449\,\mu_2$

P.E. of $\mu_3 = 0.67449\sqrt{6/n}\;\sigma^3 = 0.116816\,\sigma^3$

P.E. of $\mu_4 = 0.67449\sqrt{96/n}\;\mu_2^2 = 0.4673\,\mu_2^2$

For q = 5, we must find $\mu_{10}$.

But Sheppard 2 has shown that for the normal curve (in our
present notation)

$$\mu_{2s+1} = 0, \qquad \mu_{2s} = (2s-1)(2s-3)\cdots 3\cdot 1\cdot \mu_2^{\,s}.$$

Hence we get $\mu_{10} = 945\,\mu_2^5$.

Substituting in the above formula, we get $\sigma^2_{\mu_5} = \frac{720}{n}\,\mu_2^5$.

Thus P.E. of $\mu_5 = 0.67449\sqrt{720/n}\;\sigma^5 = 1.279656\,\sigma^5$.

We thus get:

$\mu_2 = 1.904375 \pm 0.128431$
$\mu_3 = -0.47784377 \pm 0.306985$
$\mu_4 = 12.86564258 \pm 1.467485$
$\mu_5 = -8.18049398 \pm 6.404550$

Sheppard’s Correction. 

I shall now consider the question of corrections for grouping.
The theoretical work in this subject now consists of a good deal of
literature. I shall discuss this question from a purely practical
point of view. The fundamental memoir is W. F. Sheppard 3:
“On the Calculation of the most Probable Values of Frequency
Constants, for Data arranged according to Equidistant Divisions of
Scale,” Proc. Lond. Math. Soc., Vol. 29, pp. 353-380.


1 “On Probable Errors of Frequency Constants,” Biometrika, Vol. 2 (1903),
p. 276.

2 W. F. Sheppard: Phil. Trans. Roy. Soc., 192A.

3 (a) A summary of Sheppard’s memoir (with some new results) is given in
an Editorial Note: “On an Elementary Proof of Sheppard's Formulae for
correcting Raw Moments and on Other Allied Points” in Biom. Vol. 3, pp. 308-310.

(b) In Pearson’s paper: “On Systematic Fitting of Curves, etc.,” Biom. Vols.
1 and 2, this question has been discussed from a different standpoint.

(c) Sheppard himself has given a simplified method of obtaining certain
corrections in a later paper: “The Calculation of Moments of a
Frequency-Distribution,” Biom. Vol. 5 (1907), pp. 450-459.

(d) Eleanor Pairman and Karl Pearson have published a memoir: “On
Corrections for the Moment-coefficients of Limited Range Frequency Distributions
etc.” in Biom. Vol. 12 (1919), pp. 231-258, which I shall have occasion to
discuss later on in greater detail.


In our notation the above correction (which is known as Sheppard’s
correction) is given by the following set of equations:

$$\mu_1' = \nu_1'$$

$$\mu_2' = \nu_2' - \tfrac{1}{12}h^2$$

$$\mu_3' = \nu_3' - \tfrac{1}{4}h^2\nu_1'$$

$$\mu_4' = \nu_4' - \tfrac{1}{2}h^2\nu_2' + \tfrac{7}{240}h^4$$

$$\mu_5' = \nu_5' - \tfrac{5}{6}h^2\nu_3' + \tfrac{7}{48}h^4\nu_1'$$

$$\mu_6' = \nu_6' - \tfrac{5}{4}h^2\nu_4' + \tfrac{7}{16}h^4\nu_2' - \tfrac{31}{1344}h^6$$

h is the length of the base unit; it is usually = 1 for working units.

50 mm. unit of grouping. 

Making these corrections we find adjusted moments about 1655
to be

$\mu_1' = 0.025$, $\mu_2' = 1.821667$, $\mu_3' = -0.341250$,

$\mu_4' = 11.901667$, $\mu_5' = -6.292187$, $\mu_6' = 147.339783$.

Now transferring to Mean we get

$\mu_2 = 1.821042$, $\mu_3 = -0.4778$,

$\mu_4 = 11.941622$, $\mu_5 = -7.7823$.

Hence we finally get “corrected” constants:

Mean = 1656.25 ± 3.217 mm.

S.D. = 67.473 ± 2.6162 mm.

Coeff. of V, V = 4.0738 ± 0.1376

$\beta_1 = 0.037810 \pm 0.054133$

$\beta_2 = 3.6010 \pm 0.712069$

sk = 0.073110 ± 0.062232

d = 4.932950 ± 4.223020 mm.

Note .—Starting with 1430 as our base unit, we reach the same results, thus 
the arithmetic is absolutely checked in this case. 
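
Sheppard's adjustment of the first five raw moments can be sketched as below. The sketch assumes the raw moments about 1655 mm obtained from the 50 mm grouping (working unit h = 1); the function name `sheppard` is ours. Exact fractions are used, so the output can be compared digit by digit with the adjusted moments quoted above.

```python
from fractions import Fraction as F

def sheppard(v, h=1):
    """Adjust raw moments v[1..5] (about a fixed base) for grouping width h."""
    h = F(h)
    return {
        1: v[1],
        2: v[2] - h ** 2 / 12,
        3: v[3] - h ** 2 * v[1] / 4,
        4: v[4] - h ** 2 * v[2] / 2 + F(7, 240) * h ** 4,
        5: v[5] - F(5, 6) * h ** 2 * v[3] + F(7, 48) * h ** 4 * v[1],
    }

# Raw moments about 1655 mm for the 50 mm grouping (sums divided by 200).
raw = {1: F(5, 200), 2: F(381, 200), 3: F(-67, 200),
       4: F(2565, 200), 5: F(-1315, 200)}

adj = sheppard(raw)
mu2 = adj[2] - adj[1] ** 2             # corrected second moment about the Mean
sd_mm = 50 * float(mu2) ** 0.5         # corrected S.D. in millimetres

for k in sorted(adj):
    print(f"mu_{k}' = {float(adj[k]):+.6f}")
print(f"mu_2 = {float(mu2):.6f}, S.D. = {sd_mm:.3f} mm")
```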

The Frequency Constants were next calculated (both with
and without Sheppard’s correction) for widely different units of
grouping. We have 1 mm., 20 mm., 30 mm., 50 mm. and finally 100
mm. as our unit of grouping. It will be observed that the unit of
grouping is thus successively made the same as, 20 times, 30 times,
50 times and finally 100 times the unit of measurement.

With “ungrouped” (i.e., 1 mm.) measurements, the arithmetical
labour is tremendous. In this case the maximum value of x
is -210, which involves calculating (210)⁴ for the fourth moment.
Hence it was not possible to go beyond the fourth moment. As
it is, the actual sum of fourth-products, i.e., S(x⁴y), runs into 11
figures. I quote actual results:

S(xy) = 158

S(x²y) = 908272

S(x³y) = -6768878

S(x⁴y) = 14404286076

which gives us (dividing by 200):

$\nu_1' = 0.79$

$\nu_2' = 4541.36$

$\nu_3' = -33844.39$

$\nu_4' = 72021430.38$

For purposes of comparison it is necessary to reduce all
moments to the same unit. 50 mm. was chosen as the standard unit. 1
Let $\mu_n$ be any moment in units of grouping h, let $M_n$ be the
corresponding moment in standard units $h_0$, and let $\rho = h/h_0$.

Then $M_n = \rho^n \mu_n$ is the formula of reduction to standard unit.

For $h_0 = 50$ mm., $\rho = \frac{1}{50}$, $\frac{2}{5}$, $\frac{3}{5}$ and 2 successively for units
of 1 mm., 20 mm., 30 mm. and 100 mm. respectively.
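
As an illustration of the reduction formula, the following sketch converts the ungrouped (1 mm.) second moment to the standard 50 mm unit. It assumes the ungrouped raw moments quoted above (first 0.79, second 4541.36); the function name `to_standard_units` is ours.

```python
def to_standard_units(moment, order, h, h0=50.0):
    """Reduce a moment of given order from grouping unit h to standard unit h0."""
    rho = h / h0
    return rho ** order * moment

# Ungrouped raw moments (unit 1 mm).
v1, v2 = 0.79, 4541.36

mu2_mm = v2 - v1 ** 2                            # second moment about the Mean, mm^2
mu2_std = to_standard_units(mu2_mm, 2, h=1.0)    # the same in 50 mm units

print(f"mu_2 = {mu2_mm:.4f} mm^2 = {mu2_std:.6f} standard units")
```

The result, about 1.8163, is the ungrouped value of the second moment before any correction is applied.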

The annexed table gives the Frequency Constants for the 
different units of grouping. I have added the probable errors in 
each case. 

For the purpose of studying the effect of grouping it is natural 
to take the “ ungrouped ” constants as our standard. We have 
accordingly assumed that the 1 mm. constants are the “ true ” 
constants. 

Different Values of Mean Stature.

Unit of Grouping.

  1 mm.     1656.79 ± 3.23 mm.
 20  ,,     1656.85 ± 3.23  ,,
 30  ,,     1656.35 ± 3.23  ,,
 50  ,,     1656.25 ± 3.22  ,,
100  ,,     1659.50 ± 3.09  ,,


When the unit of grouping is so large as 100 mm. (and the
total record is divided only into 5 groups), there is considerable
difference in the Mean. But this difference of 2.71 mm. is less than
the probable error of over 3 mm. Thus even with 100 mm. grouping,
the Mean is stable within the limits of its own probable error.

The agreement is almost perfect when we omit the 100 mm.
group. The maximum “error” due to grouping amounts to only
0.54 mm., which is considerably less than the unit of measurement
itself and is about 1/6 of the probable error.

Let us consider a very large sample of 7,500 individuals. It
is not likely that the Standard Deviation will exceed 70 mm. The
P.E. of Mean will be about 0.55 mm. The maximum observed
difference in the present case, due to grouping, is thus of the same
order as the random P.E. of the Mean in a sample of 7,500. We
conclude therefore that for samples of 200, the effect of grouping on the
Mean up to 50 mm. is quite negligible.
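
The comparison between a grouping error and the random error of a larger sample reduces to one line of arithmetic. A minimal sketch, assuming σ = 70 mm. as in the text; the function names `pe_mean` and `equivalent_n` are ours.

```python
import math

Q = 0.6744898   # probable error of a unit normal deviate

def pe_mean(sigma, n):
    """Probable error of the Mean for a sample of n with S.D. sigma."""
    return Q * sigma / math.sqrt(n)

def equivalent_n(sigma, error):
    """Sample size whose P.E. of the Mean equals the given error."""
    return (Q * sigma / error) ** 2

print(f"P.E. of Mean, n = 7500: {pe_mean(70, 7500):.3f} mm")
print(f"n giving a P.E. of 0.54 mm: {equivalent_n(70, 0.54):.0f}")
```

The same inversion, with the appropriate P.E. formula, gives the equivalent sample sizes quoted later for the S.D. and the other constants.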


1 For reasons explained on pp. 39-40. 

Standard Deviation.

Let us first consider the results without Sheppard’s correction:

  1 mm.     67.385 ± 2.557 mm.
 20  ,,     67.894 ± 2.547  ,,
 30  ,,     68.365 ± 2.444  ,,
 50  ,,     69.00  ± 2.679  ,,
100  ,,     70.922 ± 2.662  ,,

With 100 mm. the difference is quite large. It is 3.537 mm.,
which is considerably greater than the prob. error. Omitting 100
mm. we find the maximum difference to be 1.615 mm., which is
considerable, but is still less than the P.E. Such a P.E. will be
obtained with samples of 400. Thus the agreement without Sheppard's
correction is not very good.

With Sheppard's correction:

  1 mm.     67.385 ± 2.557 mm.
 20  ,,     67.648 ± 2.539  ,,
 30  ,,     67.812 ± 2.426  ,,
 50  ,,     67.473 ± 2.619  ,,
100  ,,     64.77  ± 2.432  ,,

100 mm. is again discrepant. The difference is 2.615 mm.,
which is of just the same order as the P.E. Evidently 100 mm.
grouping is too broad and the error due to grouping is no longer
negligible. This is also obvious from the fact that Sheppard’s
correction makes the S.D. actually less than its true value, while
the uncorrected value is considerably greater.

Omitting 100 mm. the agreement is excellent. The maximum
difference (which is now in the 30 mm. group) is only 0.427
mm., a value about a sixth of the probable error. It will require
a sample of 6000 to produce a random error of the same amount.

Thus with Sheppard’s correction, the effect of grouping is
quite negligible up to 50 mm. These corrections are so easily
applied that there can be no excuse for omitting them. We have
thus empirically verified the great importance of Sheppard’s
correction in giving better values of the Frequency Constants.
Henceforth it will not be necessary to compare the values obtained
without Sheppard’s correction.

Coefficient of variation: V

  1 mm.     4.0672 ± 0.1374
 20  ,,     4.0829 ± 0.1379
 30  ,,     4.0941 ± 0.1383
 50  ,,     4.0738 ± 0.1376
100  ,,     3.9029 ± 0.1318

100 mm. is obviously incorrect; we may omit this group from
further consideration. The difference 0.1643 is greater than the
P.E. Omitting 100 mm. the maximum difference is 0.0269, which
will be the P.E. in a random sample of 5,000 (with coeff. of variation
equal to 4). Thus the effect of grouping is of the same order
as the effect of sampling in a group of 5,000. Hence we conclude
that different units of grouping do not introduce any appreciable
errors in the Coefficient of Variation.

From the Anthropological standpoint, the Mean, the S.D., 
and the Coeff. of Variation are the most important constants. 
For stature, with samples of 200 with Sheppard’s correction the 
effect of even such a large unit of grouping as 50 times the unit of 
measurement is in all these cases absolutely inappreciable. 

We shall however consider the other statistical constants 
before concluding this portion of our work. 


Values of $\mu_2$.

With Sheppard’s correction:

  1 mm.     1.816261 ± 0.122505
 20  ,,     1.830498 ± 0.123477
 30  ,,     1.839472 ± 0.124083
 50  ,,     1.821042 ± 0.122840
100  ,,     1.6787   ± 0.113239


100 mm. makes a difference of 0.1376, which is just about the
same as the P.E. Otherwise the maximum difference is 0.0232,
which is only a sixth of the P.E. A random error of the same
amount will be produced in samples of 2800.


Let us now compare the values obtained without Sheppard’s
correction:

  1 mm.     1.816294 ± 0.122507
 20  ,,     1.843831 ± 0.124309
 30  ,,     1.869472 ± 0.126106
 50  ,,     1.904375 ± 0.128460
100  ,,     2.0120   ± 0.135707

100 mm. introduces an error of 0.1958, which is considerably
greater than the P.E.

The effect of grouping has now become quite obvious: 20 mm.,
30 mm. and 50 mm. now introduce steadily increasing error.
With 50 mm. the error has now amounted to 0.0881, which is only
two-thirds of the P.E.

We thus see that Sheppard’s correction is absolutely indispensable
here. With Sheppard’s correction the effect is quite
negligible up to 50 mm.

Values of $\mu_3$.

With Sheppard’s correction:

  1 mm.     -0.643606 ± 0.2859
 20  ,,     -0.308716 ± 0.2927
 30  ,,     -0.468697 ± 0.2986
 50  ,,     -0.477844 ± 0.3070
100  ,,     -0.483578 ± 0.3333

100 mm. is not at all worse than others. The maximum error
(which now occurs in the 20 mm. group), 0.3349, just exceeds the
P.E.

Without Sheppard’s correction:

 20 mm.     -0.3087 ± 0.2893
 30  ,,     -0.4655 ± 0.2914
 50  ,,     -0.4778 ± 0.2871
100  ,,     -0.6640 ± 0.2539


Evidently Sheppard’s correction does not produce substantial
improvements. In this case the gross P.E. of $\mu_3$ is of the
same order as $\mu_3$ itself and hence there is wide fluctuation in the
result.

In view of the large P.E. we cannot say that grouping makes
any significant difference. The asymmetry is very slight and very
nearly zero, thus the fluctuations though large are not statistically
significant. These wide fluctuations indicate the critical approach
to the Gaussian curve.

Values of $\mu_4$.

With Sheppard’s correction:

  1 mm.     11.561021 ± 1.5415
 20  ,,     11.565426 ± 1.5659
 30  ,,     10.971178 ± 1.58
 50  ,,     11.941622 ± 1.5497
100  ,,     10.3596   ± 1.3158


100 mm. makes a difference of 1.2014, which nearly equals the
P.E. Otherwise the agreement is good. The maximum error is
0.59 (in the 30 mm. group), which is much less than the P.E.
Random error of the same amount will require samples of 1300
individuals.

Without Sheppard’s correction the agreement is much worse.
We have

  1 mm.     11.561021 ± 1.5415
 20  ,,     11.7020   ± 1.5887
 30  ,,     11.3037   ± 1.6331
 50  ,,     12.865642 ± 1.6946
100  ,,     13.981248 ± 1.8913


100 mm. has become too “rough” and 50 mm. itself introduces
an error of about the same order as the P.E. Thus Sheppard’s
corrections make substantial improvement in the results. The
percentage probable error of $\mu_4$ for normal curves is given by
$\tfrac{1}{3}\chi_1\sqrt{96} = 15.57\%$ in our case. In view of this large
percentage variation, observed agreement with different groupings
is quite satisfactory.

Values of $\mu_5$.

With Sheppard’s correction: 1

 30 mm.     -11.925572 ± 5.8711
 50  ,,      -7.7823   ± 5.7274
100  ,,     -11.094934 ± 4.6678

Without Sheppard's correction:

 50 mm.      -8.180494 ± 6.4046
100  ,,     -14.56     ± 7.3465

The gross prob. error is again of the same order as $\mu_5$ itself.
Hence there is very wide fluctuation in its value and Sheppard’s
correction is not important. It should be noted however that
even now the maximum difference (inter se) is less than the P.E.

Values of $\mu_6$.

 30 mm.     121.8305 ± 29.9343
 50  ,,     154.7313 ± 29.0398

The percentage P.E. for the normal curve is 32.63%.
With such large percentage variation it is quite idle to calculate the
higher moments directly.

Pearson says in this connection 2 “Constants based on high 
moments will be practically idle. They may enable us to describe 
closely an individual random sample, but no safe argument can be 
drawn from this individual sample as to the general population at 
large, at any rate so far as the argument is based on the constants 
depending upon these high moments.” 


Values of $\beta_1$.

  1 mm.     0.068756 ± 0.079781
 20  ,,     0.015538 ± 0.019324
 30  ,,     0.0353   ± 0.035768
 50  ,,     0.037810 ± 0.065555
100  ,,     0.049490 ± 0.0631


Remembering that $\beta_1 = \mu_3^2/\mu_2^3$, we are quite prepared for such
wide fluctuations. It will be seen that $\beta_1$ differs from zero by just
about the same amount as its own P.E. (calculated separately for
each), which of course implies that there is a tendency towards $\beta_1$
differing slightly from zero, but that with a small sample of 200
this tendency has not become quite significant. The unit of
grouping does not make any difference so far as this tendency is
concerned. With 50 mm. without correction, $\beta_1 = 0.033204 \pm 0.045201$.
Thus Sheppard’s correction is not important.


1 On account of the great arithmetical labour, it has not been found possible
to calculate $\mu_5$ and $\mu_6$ with lower units of grouping.

2 Drapers’ Company Research Memoirs: “On the General Theory of Skew
Correlation and Non-Linear Regression,” p. 9.


Values of $\beta_2$.

  1 mm.     3.5046   ± 0.6017
 20  ,,     3.451621 ± 0.497297
 30  ,,     3.2424   ± 0.354489
 50  ,,     3.601000 ± 0.712069
100  ,,     3.4536   ± 0.4851
 50  ,,     3.547534 ± 0.589625 (without correction).


Though $\beta_2$ does not seem to differ significantly from 3, there is
slight tendency towards lepto-kurtosis. 1

The P.E. of $\beta_2$ for a Gaussian distribution is $\chi_1\sqrt{24}$ and is
about ±0.23 in our case. The magnitude of P.E. again shows the
want of significant divergence from meso-kurtosis.

The effect of grouping is evidently quite negligible. The 
above investigation has been most elaborate in character and is 
sufficient to justify the application of “grouped” statistical 
methods to our present material. 

The foregoing analysis may be summarized thus:— 

(1) With samples of 200, even such broad grouping as 100 mm.
does not introduce errors greater than the random error of sampling.

(2) Up to 50 mm. the effect of grouping is absolutely negli¬ 
gible. In the case of the Mean, the S.D. and the Coeff. of Varia¬ 
tion, “ grouping error ’ ’ is of the same order as ‘ ‘ random error ’ ’ in 
samples of several thousands of individuals. 

(3) Sheppard’s correction leads to a very substantial improve¬ 
ment in the S.D. and the even moments. The odd moments 
(being near a critical value) are not affected very much. Speaking 
generally, Sheppard’s correction should never be omitted. 

(4) The percentage variation in the higher moments is too 
large to make it worth while calculating them directly. 

I speak with hesitation about another inference which may 
perhaps be drawn from the above investigation. Small errors of 
estimating stature—even up to perhaps a few mm. are not likely to 
affect the Mean value very considerably (provided these errors are 
random errors and not systematic). 

“ Full Corrections ” of Pairman and Pearson. 

We shall now consider certain “full corrections” recently
discussed by Pairman and Pearson. 2 The object of the above
paper was to investigate the full corrections for curtailed blocks of
frequency.

The general shape of our curve showed that there was no
significant curtailing; still I thought it advisable to investigate
this point more carefully.


1 K. Pearson: “Skew Variation, a Rejoinder,” Biom. Vol. 4 (1906), p. 175.
Also Appendix II.

2 Eleanor Pairman and K. Pearson: “On Corrections for the Moment-
Coefficients of Limited Range Frequency Distributions when there are Finite or
Infinite Ordinates and any Slopes at the Terminals of the Range,” Biom. Vol.
12 (1919), pp. 231-258.


We choose the 50 mm. unit of grouping as our standard and find
“raw” moments about one end of range, i.e. 1430 mm.


Raw Moments are:

$\nu_1' = 4.5250$

$\nu_2' = 22.38$

$\nu_3' = 118.0262$

$\nu_4' = 657.4275$

Note. These lead to the same moments
about Mean as obtained from raw moments
about 1655. Hence there is an absolute check
on the Arithmetic.


Instead of working with the proportional frequencies, we can
work with $y_1, y_2, \ldots$, the actual frequencies, and then divide
the whole by 200. Thus we get the following
(slightly modified) formulae from p. 233 of the paper cited
above.

$$a_1 = -\tfrac{1}{200}\cdot\tfrac{1}{60}\{137y_1 - 163y_2 + 137y_3 - 63y_4 + 12y_5\}$$

$$a_2 = +\tfrac{1}{200}\cdot\tfrac{1}{12}\{45y_1 - 109y_2 + 105y_3 - 51y_4 + 10y_5\}$$

$$a_3 = -\tfrac{1}{200}\cdot\tfrac{1}{4}\{17y_1 - 54y_2 + 64y_3 - 34y_4 + 7y_5\}$$

$$a_4 = +\tfrac{1}{200}\{3y_1 - 11y_2 + 15y_3 - 9y_4 + 2y_5\}$$

$$a_5 = -\tfrac{1}{200}\{y_1 - 4y_2 + 6y_3 - 4y_4 + y_5\}$$

and for b’s

$$b_1 = +\tfrac{1}{200}\cdot\tfrac{1}{60}\{137y_p - 163y_{p-1} + 137y_{p-2} - 63y_{p-3} + 12y_{p-4}\}$$

$$b_2 = -\tfrac{1}{200}\cdot\tfrac{1}{12}\{45y_p - 109y_{p-1} + 105y_{p-2} - 51y_{p-3} + 10y_{p-4}\}$$

$$b_3 = +\tfrac{1}{200}\cdot\tfrac{1}{4}\{17y_p - 54y_{p-1} + 64y_{p-2} - 34y_{p-3} + 7y_{p-4}\}$$

$$b_4 = -\tfrac{1}{200}\{3y_p - 11y_{p-1} + 15y_{p-2} - 9y_{p-3} + 2y_{p-4}\}$$

$$b_5 = +\tfrac{1}{200}\{y_p - 4y_{p-1} + 6y_{p-2} - 4y_{p-3} + y_{p-4}\}$$


In our case

$y_1 = 3$, $y_2 = 5$, $y_3 = 14$, $y_4 = 45$, $y_5 = 60$

$y_p = 2$, $y_{p-1} = 3$, $y_{p-2} = 20$, $y_{p-3} = 48$, $y_{p-4} = 60$


Stature in mm.     Frequency = y.

1430-1480               3
    -1530               5
    -1580              14
    -1630              45
    -1680              60
    -1730              48
    -1780              20
    -1830               3
    -1880               2

Total                 200

Hence we obtain

$a_1 = +0.050083$      $b_1 = +0.018667$

$a_2 = -0.264583$      $b_2 = -0.006250$

$a_3 = +0.541250$      $b_3 = -0.075000$

$a_4 = -0.605000$      $b_4 = +0.195000$

$a_5 = +0.265000$      $b_5 = -0.105000$
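
The a's and b's are fixed linear combinations of the five frequencies at each end of the range, so they are easy to recompute. The sketch below assumes the weights of the formulae quoted from Pairman and Pearson and the 50 mm frequencies of the stature table; the names `WEIGHTS` and `abruptness` are ours. The recomputed a's, and b₂, b₃, b₄, reproduce the printed values; the recomputed b₁ and b₅ may differ from the printed figures in the later decimals.

```python
from fractions import Fraction as F

# For each order s: (sign used for the a's, divisor, weights on the five
# terminal frequencies). The b's use the opposite sign.
WEIGHTS = {
    1: (-1, 60, (137, -163, 137, -63, 12)),
    2: (+1, 12, (45, -109, 105, -51, 10)),
    3: (-1, 4, (17, -54, 64, -34, 7)),
    4: (+1, 1, (3, -11, 15, -9, 2)),
    5: (-1, 1, (1, -4, 6, -4, 1)),
}

def abruptness(freqs, total, start=True):
    """Coefficients for one terminal; freqs are counted inward from that end."""
    out = {}
    for s, (sign, div, w) in WEIGHTS.items():
        lin = sum(wi * yi for wi, yi in zip(w, freqs))
        if not start:
            sign = -sign
        out[s] = F(sign * lin, div * total)
    return out

y = [3, 5, 14, 45, 60, 48, 20, 3, 2]
a = abruptness(y[:5], 200)                       # y1 ... y5
b = abruptness(y[::-1][:5], 200, start=False)    # yp, yp-1, ... yp-4

for s in range(1, 6):
    print(f"a_{s} = {float(a[s]):+.6f}   b_{s} = {float(b[s]):+.6f}")
```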

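From these we obtain:

$A_1 = a_1 - \tfrac{1}{60}a_3 + \tfrac{1}{2520}a_5 = +0.04116016$

$B_1 = b_1 - \tfrac{1}{60}b_3 + \tfrac{1}{2520}b_5 = +0.01987500$

$A_2 = a_2 - \tfrac{5}{126}a_4 = -0.24057540$

$B_2 = b_2 - \tfrac{5}{126}b_4 = -0.01398809$

$A_3 = a_1 - \tfrac{5}{63}a_3 + \tfrac{1}{240}a_5 = +0.00823124$

$B_3 = b_1 - \tfrac{5}{63}b_3 + \tfrac{1}{240}b_5 = +0.02418155$

$A_4 = -0.21164580$

$B_4 = -0.02387500$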

From Equations (xxii) to (xxv) on p. 240, we get the fully 
corrected raw moments to be :— 

M i= v i+i'i{ A \ + e i) 

IH = v i - T 2 + T¥ 0 { B't ~ A ?J 

H = V S - £ v \ + 1 " To (^3 + B i) + To/’ #2 + ffi B \ } 

I*4! = V* - h v t + 2T0 + {TTff (^4 ” B J “ TO P B S + 2U P %B l + 5 , } 

In our case the range p = 9, and we get:

$\mu_1' = \nu_1' + \{0.00508626\}$

$\mu_2' = \nu_2' - \tfrac{1}{12} + \{0.03170069\}$

$\mu_3' = \nu_3' - \tfrac{1}{4}\nu_1' + \{0.39121823\}$

$\mu_4' = \nu_4' - \tfrac{1}{2}\nu_2' + \tfrac{7}{240} + \{0.45671958\}$

where the curled brackets give the correction over and above
Sheppard’s correction.

Thus we get fully adjusted raw moments to be

$\mu_1' = 4.5300862630$

$\mu_2' = 22.3283673585$

$\mu_3' = 117.2862182312$

$\mu_4' = 646.7233862484$

Transferring to the Mean (which itself is now changed) we
obtain the Moment-Coefficients about the Mean.

Moments after “full correction”

$\mu_2 = 1.806686$

$\mu_3 = -0.232097$

$\mu_4 = 7.530339$

and the Mean = 1656.5043 mm.

with S.D. = 67.1950 mm.
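
The passage from the raw moments about 1430 mm. to the fully adjusted moments can be followed numerically. The sketch assumes the raw moments about 1430 mm. and the four bracketed extra corrections quoted for p = 9; the variable names are ours. Only the first two steps of the transfer to the Mean are shown.

```python
# Raw moments about 1430 mm (working unit 50 mm).
v1, v2, v3, v4 = 4.525, 22.38, 118.02625, 657.4275

# Corrections over and above Sheppard's, as quoted for p = 9.
extra = (0.00508626, 0.03170069, 0.39121823, 0.45671958)

m1 = v1 + extra[0]
m2 = v2 - 1 / 12 + extra[1]
m3 = v3 - v1 / 4 + extra[2]
m4 = v4 - v2 / 2 + 7 / 240 + extra[3]

mean_mm = 1430 + 50 * m1     # the Mean shifts with the first moment
mu2 = m2 - m1 ** 2           # second moment about the new Mean

print(f"mu_1' = {m1:.8f}, mu_2' = {m2:.8f}")
print(f"mu_3' = {m3:.8f}, mu_4' = {m4:.8f}")
print(f"Mean = {mean_mm:.4f} mm, mu_2 = {mu2:.6f}")
```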

Comparing with our “standard” values we see evident signs
of “over correction.” With such small samples as 200, the P.E.
in terminal frequencies are too great to allow the a’s and b’s to
be calculated with any degree of accuracy. The transfer of one
individual from one group to another would seriously affect the
results.

In order to test this point, I next calculated the a’s and b’s
with a shorter sub-range, i.e. 40 mm.

Thus $\rho = h_0/h = 50/40 = 1.25$.

Hence $a_s' = (1.25)^s\,a_s$

$b_s' = (1.25)^s\,b_s$


Group (mm.)     Symbol       Frequency

1430-1470       $y_1$            2.0
    -1510       $y_2$            2.5
    -1550       $y_3$            6.0
    -1590       $y_4$           18.5
    -1630       $y_5$           38.0
    -1670                       51.0
    -1710       $y_{p-4}$       40.0
    -1750       $y_{p-3}$       24.0
    -1790       $y_{p-2}$       15.0
    -1830       $y_{p-1}$        1.0
    -1870       $y_p$            2.0


From these we get

$a_1 = +0.001750$      $b_1 = +0.093667$

$a_2 = -0.048333$      $b_2 = -0.305$

$a_3 = +0.10$          $b_3 = +0.505$

$a_4 = -0.11$          $b_4 = -0.42$

$a_5 = +0.04$          $b_5 = +0.16$

leading to

$a_1' = +0.002188$     $b_1' = +0.117083$

$a_2' = -0.075521$     $b_2' = -0.476563$

$a_3' = +0.195313$     $b_3' = +0.986328$

$a_4' = -0.268555$     $b_4' = -1.025391$

$a_5' = +0.122070$     $b_5' = +0.488281$

These give 

1*2 = 1-89 48 76 86 

giving S.D. = 68*82 5 

r and Mean = 1656*66 58 mm. 
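The passage from the unprimed to the primed coefficients can be checked numerically. The sketch below assumes that the s-th coefficient is simply multiplied by the s-th power of the ratio 50/40 = 1.25, which is how the figures above appear to be related; the function name is illustrative only.

```python
RATIO = 50 / 40  # ratio of the two sub-ranges, 1.25

# Coefficients a_1..a_5 and b_1..b_5 as read from the text.
a = [0.001750, -0.048333, 0.10, -0.11, 0.04]
b = [0.093667, -0.305, 0.505, -0.42, 0.16]


def rescale(coeffs, ratio=RATIO):
    """Multiply the s-th coefficient (s = 1, 2, ...) by ratio**s."""
    return [c * ratio ** s for s, c in enumerate(coeffs, start=1)]


a_scaled = rescale(a)
b_scaled = rescale(b)

for s, (x, y) in enumerate(zip(a_scaled, b_scaled), start=1):
    print(f"a_{s}' = {x:+.6f}    b_{s}' = {y:+.6f}")
```

The rescaled values agree with the primed coefficients quoted above to the number of places given.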

The values are again quite discrepant from those given above. 
With subrange of 25 mm. still more widely divergent values 
were obtained. 

Hence we are obliged to conclude that with small samples,
the probable errors of the terminal frequencies are much too large
to allow Pairman and Pearson’s “full corrections” to be calculated
with accuracy.
The general conclusion of the above investigation is this.

There is no indication of appreciable “curtailing” of our
material. Further, with small samples, the “abruptness coefficients”
cannot be calculated with any reasonable degree of accuracy
and these “full corrections” will necessarily have to be omitted.
But we have already seen that Sheppard’s correction can be safely
applied and should never be omitted.



SECTION III. ON THE STATISTICAL TESTS OF HOMOGENEITY

One of the main objects of our present enquiry is to investigate
the “homogeneity” of our material. For this purpose it is
necessary to have some precise definition of “ homogeneity.” I 
fully realise the great difficulties underlying any attempt at such 
a definition, but in order to avoid confusion of thought I have found 
it impossible to forego at least a working definition. I shall 
approach the problem from a purely statistical point of view. 

“ Homogeneity ” implies similarity and functional equivalency 
among the members of a group of any class of objects. When all 
the members are identical with respect to some definite property, 
homogeneity is perfect with reference to that particular property. 
This is the ideal limit of thought, but in practice it always remains a 
mere intellectual abstraction. 

Thus in actual practice diversity is always present. But if 
the similarity attains a certain intensity we can speak of the 
group as being homogeneous. The actual amount of similarity 
considered necessary to attain this intensity is of course a matter 
of practical convenience. A group which is homogeneous for one
purpose may be quite heterogeneous for another. 1

“Homogeneity” thus ultimately depends on our standard of
discrimination. 2 If the actual difference between any two members
of a group is less than our unit of discrimination, we can
never become aware of this difference and the group will appear to
be homogeneous. On the other hand, if the actual difference is
greater, heterogeneity will become evident. If our unit of
discrimination is made indefinitely small and yet no heterogeneity is
detected, we gradually approach identity, which is the ideal limit
of thought.

The concept of “homogeneity” is thus essentially relative
and practical. We can never have any absolute logical criterion
of homogeneity. We must set up separate standards of homogeneity
in each case. To this extent the definition of homogeneity
is necessarily arbitrary and conventional. But having
once set up a standard we must rigidly adhere to it. We cannot
give it up in the middle of a discussion on the plea of arbitrariness.

The discriminant may be either qualitative or quantitative;
in either case it should be precise and definite.

We can now proceed to set up tests of homogeneity for our 
special purpose. 


1 Cf. K. Pearson: “Skew Variation,” Biom. Vol. 4 (1906), pp. 176, 192 and
p. 185.

2 e.g. In statistics, the probable error is the fundamental discriminant, 
in Experimental Psychology the least perceptible difference is the ultimate 
unit. 
From the statistical standpoint our first necessity is suitable
graduation of the given sample. This is necessary in order to
draw legitimate inferences about the general population from a
study of the given sample. 1 Our first condition is:

I. We should be able to graduate the given sample by a smooth
curve. That is, the given frequency distribution must be homotypic 2
in character. 3

The goodness of fit can be tested by the Pearsonian Contingency
Coefficient. 4

Possibility of graduation by a smooth curve is thus a necessary
condition of statistical homogeneity.

This is not however sufficient. All heterotypic curves are
excluded, but a homotypic frequency curve need not necessarily be
homogeneous. For example, it may well happen that a mixture
of two different homogeneous samples is amenable to graduation
by a homotypic curve. But even then if the given curve can be
split up into simpler components we get direct evidence of
heterogeneity.

II. Thus our second condition is that the sampled frequency
curve should not be capable of being analysed 5 into simpler real 6
components.

Pearson 7 has furnished us with a technical method for dissection
into two components. But failure in dissection may also
imply that the curve is multi-complex in character, i.e. that it is
built up of more than two simple components. This second
condition (impossibility of analysis), again, though necessary, is yet
not sufficient.

The concept of functional equivalency provides us with
another test. If we consider any sub-sample, it should be generally
equivalent to another sub-sample, that is, it should not differ
significantly from other sub-samples. Thus we get:

III. The frequency constants of different sub-samples should
agree within the limits of their own probable error. 8


1 We assume throughout that all samples are random samples, that is, we 
definitely exclude heterogeneity due to mere “ bias ” in sampling. 

2 Homotypic curves will ordinarily include the Gaussian and the different 
Pearsonian skew curves. Other smooth curves (Edgeworth, Charlier, Thiele, 
Kapteyn etc.) may also be included. 

3 The possibility of suitable graduation of the present material has been
discussed in Section IV, pp. 35-40.

4 The original memoir was given in Phil. Mag. 1900, pp. 157-175. For a
discussion of its use in testing goodness of fit see L. Isserlis: “On the
Representation of Statistical Data.” Biometrika, Vol. XI (1917), pp. 418-425.

5 The possibility of dissection of the present material has been investigated in 
Section VI. 

6 Negative and imaginary solutions are sometimes obtained; until we can 
give a consistent interpretation of these, it is perhaps safer to ignore such purely 
mathematical solutions. 

7 Memoir on Dissection of Curves, already cited. Phil. Trans. Roy. Soc.,
185A (1894).

8 Strictly speaking, the agreement of subsamples is only an indirect test of
homogeneity. What it actually does serve to show is the representative character
of the given sample.
This condition ensures that the sub-samples will not differ
significantly from the general sample. 1

The above three tests are purely formal and have no reference 
to the nature of the material. We can proceed further by taking 
into consideration our previous experience of similar material. 

Let us take the case of stature as an example. In all known
cases stature distribution is either approximately Gaussian or is of 
Type IV or Type I. Consider the frequency distribution of some 
unknown sample. If we find that the curve though homotypic is 
J or U shaped, we are naturally suspicious about the homogeneity 
of the material. The curve may be smooth, it may successfully 
resist dissection, its sub-samples may agree quite well, yet in view 
of our previous experience we would, in the absence of other 
evidence, hesitate to call it homogeneous. 

IV. Our fourth criterion is that the general nature of the
sampled frequency should be the same as that of known homogeneous 
material. 

This criterion is quite empirical in character and its practical 
utility depends upon what exact significance we can attach to the 
concept of “ general nature of known frequency constants.” 
Though somewhat vague this condition is by no means useless. 


1 Let us suppose that the given sample is really heterogeneous in character.
Consider a “random” subsample of the given sample. Now if this subsample is
to be representative in character, it must include the same degree of heterogeneity
as is present in the sample itself, that is, in order that it may be a “fair” as well as
a “random” subsample, it is necessary that it should be sufficiently large.
Samples which are large enough to be “fair” will obviously agree among
themselves. Thus the agreement of large fair subsamples cannot reveal the want of
homogeneity of the given sample.

Now consider a subsample which is again “random” but which is not
sufficiently large to include the same degree of heterogeneity as is present in the
sample. Such subsamples not being representative in character, it will not be
surprising if they fail to agree. Thus want of agreement on the part of subsamples
on account of their smallness of size will not necessarily prove the existence of
heterogeneity in the material. The lower limit of agreement of random subsamples
may however be looked upon as a measure of homogeneity.

In any case, however, agreement of random subsamples does show that these
subsamples are large enough to be representative in character. The given sample,
being larger than its own subsamples, will obviously be large enough to be
representative in character. Thus the agreement of subsamples is a test of the
representative character of the sample, rather than any evidence of the
homogeneity of the material.

An example may help. Consider an ordinary black and white chess board.
Let us look at this chessboard through a sighting hole. The size of this sighting
hole determines the size of the sample. If this size is larger than the size of one
of the squares then each sample will show a mixed patch. In this case subsamples
would agree. On the other hand, if the size of the sighting hole is only a fraction
of the size of a square, then some samples will show white, some black and others
mixed patches. The lower limit up to which samples agree is evidently a
measure of the size of the discontinuities. Agreement of subsamples of 100 shows
that 200 is large enough to be representative in character in the present case.
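The chessboard illustration can be tried out numerically. In the following sketch the board is divided into small cells, a square sighting hole is moved over every position, and the proportion of black seen through it is recorded. The cell resolution and the two hole sizes are illustrative assumptions, not values taken from the text.

```python
SQUARE = 4          # cells along one side of a chessboard square (assumed resolution)
BOARD = 8 * SQUARE  # an 8 x 8 board measured in cells


def is_black(row, col):
    """Colour of a cell on an ordinary chequered board."""
    return ((row // SQUARE) + (col // SQUARE)) % 2 == 1


def black_fractions(window):
    """Proportion of black seen through a window x window hole at every position."""
    fractions = []
    for r0 in range(BOARD - window + 1):
        for c0 in range(BOARD - window + 1):
            black = sum(
                is_black(r, c)
                for r in range(r0, r0 + window)
                for c in range(c0, c0 + window)
            )
            fractions.append(black / window ** 2)
    return fractions


def spread(window):
    """Difference between the most and the least black view."""
    fractions = black_fractions(window)
    return max(fractions) - min(fractions)


print("hole of two squares :", spread(2 * SQUARE))
print("hole of half a square:", spread(SQUARE // 2))
```

With a hole two squares wide every view shows exactly half black, so the views agree completely. With a hole half a square wide the views range from all white to all black. The size of hole at which agreement breaks down thus reflects the size of the squares.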

This implication serves as the basis of Pearson’s discussion of P.E. of
sub-samples for comparison with the general sample. K. Pearson: “Note on
the Significant or Non-significant Character of a Sub-Sample drawn from a
Sample.” Biometrika Vol. 5 (1906), pp. 181-183.
We require some further precise quantitative test. This is
supplied by the variability (both absolute, as measured by the
Standard Deviation, and relative, as measured by the Coefficient
of Variation) of the distribution. 1

V. The variability of the sample should not be significantly
greater than the average variability of the same organ for known
homogeneous material.

The Coefficient of Variation, V (multiplication by 100 is merely
for arithmetical convenience), is a straightforward measure of
variability. It is of course possible to set up other standards by
choosing some other function of the S.D. and Mean, but it
is quite unnecessary to enter into such subtleties in the present stage
of our knowledge.

It is quite easy to extend the above condition to the case of
more than one organ. In that case we shall have to define
variability by the generalised 2 or multiple probable error of the group
of organs considered. 3

We have thus got five different tests of “homogeneity.” It
should be remembered that we have all along discussed statistical
homogeneity. Whether statistical homogeneity necessarily implies
anthropological homogeneity and vice versa, is a very difficult
question, 4 into which I do not propose to enter. I confine
myself to a consideration of purely statistical homogeneity.


1 For a full discussion see Pearson: Chances of Death “Variation in Man 
and Woman,” pp. 255—377, specially pp. 272—286. Also Appendix I. 

2 K. Pearson and Alice Lee: “On the Generalised Probable Error in
Multiple Normal Correlation,” Biom. Vol. 6 (1908), pp. 59-68.

3 Incidentally we may note that variability gives us a convenient method of
defining a “normal” group (in a medical, psychological or social sense) of
individuals. The normal group (with reference to some particular trait) consists of the
individuals included between the Mean, M, and p times the S.D. σ, where p is an
arbitrary number. Thus a “normal” individual is one who does not differ from
the average type of his class by more than pσ. By a proper choice of p we can
make our definition as elastic or as stringent as we please. We can also extend
the definition to cover more than one single trait, with the help of the generalised
or multiple probable error.

4 K. Pearson: “Craniological Notes. Homogeneity and Heterogeneity in
Collections of Crania,” Biom. Vol. 2 (1903), pp. 345-347. Also see C. Myers’
Reply to above and Pearson’s Remarks on the Reply, Biom. Vol. 2 (1903), pp.
504-508, and Aurel Von Torok’s Note and Pearson’s Reply. Ibid., pp. 508-510.



SECTION IV. TYPE OF CURVE AND “GOODNESS OF FIT”


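We shall now test the “goodness of fit” with our “normal”
curve. K. Pearson 1 has shown how this may be done. He shows
that 2 if

\[ \chi^2 = S\left\{ \frac{(m' - m)^2}{m} \right\}, \]

where S denotes a summation, and m' and m are the observed and
theoretical values in each sub-group, then the chance of a
system of errors with as great or greater frequency than that
denoted by χ² is given by

\[ P = \frac{\left[\iiint \cdots \, e^{-\frac{1}{2}\chi^2}\, dx_1\, dx_2\, dx_3 \cdots \right]_{\chi}^{\infty}}
            {\left[\iiint \cdots \, e^{-\frac{1}{2}\chi^2}\, dx_1\, dx_2\, dx_3 \cdots \right]_{0}^{\infty}}
     = \frac{\int_{\chi}^{\infty} e^{-\frac{1}{2}\chi^2}\, \chi^{\,n-1}\, d\chi}
            {\int_{0}^{\infty} e^{-\frac{1}{2}\chi^2}\, \chi^{\,n-1}\, d\chi}, \]

where n = n' - 1, n' being the number of sub-groups. This reduces
to, for n' odd,

\[ P = e^{-\frac{1}{2}\chi^2}\left\{ 1 + \frac{\chi^2}{2} + \frac{\chi^4}{2\cdot 4}
       + \frac{\chi^6}{2\cdot 4\cdot 6} + \cdots
       + \frac{\chi^{\,n'-3}}{2\cdot 4\cdot 6 \cdots (n'-3)} \right\} \]

and, for n' even,

\[ P = \sqrt{\frac{2}{\pi}} \int_{\chi}^{\infty} e^{-\frac{1}{2}\chi^2}\, d\chi
     + \sqrt{\frac{2}{\pi}}\; e^{-\frac{1}{2}\chi^2}\left\{ \frac{\chi}{1} + \frac{\chi^3}{1\cdot 3}
       + \frac{\chi^5}{1\cdot 3\cdot 5} + \cdots
       + \frac{\chi^{\,n'-3}}{1\cdot 3\cdot 5 \cdots (n'-3)} \right\}. \]

Tables 3 have been calculated to facilitate calculation of P
when χ² is known.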
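Where the printed tables are not at hand, P can be evaluated directly from the two series for n' odd and n' even. The sketch below does this with the Python standard library only; the function name and argument names are illustrative assumptions, and n' is taken to be the number of sub-groups.

```python
import math


def pearson_p(chi2, n_groups):
    """Chance P of a worse fit, from the series for n' odd and n' even.

    chi2     : the observed value of chi-squared
    n_groups : n', the number of sub-groups (n' >= 2)
    """
    if n_groups < 2:
        raise ValueError("at least two sub-groups are required")
    chi = math.sqrt(chi2)
    if n_groups % 2 == 1:
        # n' odd: 1 + chi^2/2 + chi^4/(2.4) + ... + chi^(n'-3)/(2.4...(n'-3))
        term, total, k = 1.0, 1.0, 2
        while k <= n_groups - 3:
            term *= chi2 / k
            total += term
            k += 2
        return math.exp(-chi2 / 2.0) * total
    # n' even: normal tail plus chi/1 + chi^3/(1.3) + ... + chi^(n'-3)/(1.3...(n'-3))
    tail = math.erfc(chi / math.sqrt(2.0))
    term, total, k = chi, 0.0, 1
    while k <= n_groups - 3:
        total += term
        term *= chi2 / (k + 2)
        k += 2
    return tail + math.sqrt(2.0 / math.pi) * math.exp(-chi2 / 2.0) * total


print(round(pearson_p(22.699, 19), 4))
print(round(pearson_p(2.8399, 7), 4))
```

Values found in this way may differ in the third decimal place from those obtained by linear interpolation in the tables, since the interpolation is itself only approximate.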

Pearson then shows 4 that if χ² for the sample is so small as to
warrant us in speaking of the frequency distribution as a random


1 K. Pearson: “On the Criterion that a Given System of Deviations from
the Probable in the Case of a Correlated System of Variables is such that it can be
reasonably supposed to have arisen from Random Sampling.” Phil. Mag. July
1900, p. 157.

2 χ² is thus quite easy to calculate; it is given by

\[ \chi^2 = S\left( \frac{\text{square of difference of theoretical and observed values}}
                         {\text{theoretical value of frequency}} \right) \]

3 W. Palin Elderton: “Tables for Testing Goodness of Fit.” Biom. Vol.
1 (1902), pp. 155-163. Reprinted as Table XII on p. 26 of Tables for
Statisticians, etc.

4 Pearson, paragraph 5 and following of reference 1.
variation of the frequency distribution determined from itself, then
we may also speak of it as a random sample from a general popula¬ 
tion whose theoretical distribution differs only by quantities of the 
order of the probable errors of the constants from the distribution 
deduced from the observed sample. 

Thus if a curve is a good fit to a sample, to the same fineness
of grouping it may be used to describe other samples from the same
population. If a curve serves to any degree, it will serve for all
rougher degrees, but it does not follow that it will suffice for
still finer groupings. A good fit for a large sample would be a
good fit for a smaller sample but not necessarily for a larger
one. 1

I shall test the Goodness of Fit for different groupings.
I shall next compare the fit for the same grouping given by
the slightly different values of the Standard Deviation calculated
with different units of grouping. This will test how the Goodness
of Fit is affected by the different units of grouping adopted in
calculating the frequency constants.


Normal Curve.


I have calculated the theoretical frequencies from the “raw”
(i.e. uncorrected by Sheppard’s adjustment) values of the S.D.
in some cases. For “if the ordinates of a normal curve be
calculated from the raw second moment value of the Standard
Deviation, these ordinates will more closely represent the actual
frequencies than do the ordinates of the true normal curve, which
have to be corrected by the factor

\[ 1 + \frac{1}{24}\,\frac{h^2}{\sigma^4}\,(x^2 - \sigma^2) \]

to obtain the actual frequencies.”

If therefore our sole object is to compare observed and
calculated frequencies for definite series of groups, there are
advantages in using the “raw” second moment in the equation to
the curve. Such a curve has been termed by Sheppard a “spurious
curve of frequency.” 2
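The correcting factor for the ordinates of the true normal curve can be reached by a short expansion. What follows is a sketch under stated assumptions: h denotes the unit of grouping, x the deviation of the centre of a group from the mean, and y(x) the ordinate of the normal curve with standard deviation σ. These symbols are introduced here for illustration.

```latex
\begin{aligned}
\int_{x-h/2}^{x+h/2} y(t)\,dt
  &\approx h\,y(x) + \frac{h^{3}}{24}\,y''(x), \\
y(x) \propto e^{-x^{2}/2\sigma^{2}}
  &\;\Longrightarrow\;
  y''(x) = y(x)\,\frac{x^{2}-\sigma^{2}}{\sigma^{4}}, \\
\int_{x-h/2}^{x+h/2} y(t)\,dt
  &\approx h\,y(x)\left\{ 1 + \frac{h^{2}}{24}\,\frac{x^{2}-\sigma^{2}}{\sigma^{4}} \right\}.
\end{aligned}
```

At the mean (x = 0) the factor becomes 1 - h²/(24σ²). With h = 20 mm. and σ = 67.38 mm. this is about 0.9963, so that the central ordinate of the true curve slightly overstates the frequency of the central group.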


1 For a discussion of another test of Goodness of Fit proposed by Prof.. 
Edgeworth see a Note by L. Isserliss: "On the Representation of Statistical* 
Data” Biometrika 1917, pp. 418—425. 

2 Editorial Note: "On an Elementary Proof of Sheppard’s Formulae' 
for correcting Raw Moments and on other Allied Points,” Biom. Vol. 3 (1904), 
p.311. 
Table 2.

Mean = 1656.2938 mm.        Unit = 20 mm.
S.D. = 67.3845 mm.          1/σ = .296802.

    Stature.       Observed     Theoretical     (m' - m)     (m' - m)²/m
                   Value m'.    Value m.

    Beyond 1475      3            1.1022         1.897        3.265
    1475-1495        1            1.3734         0.373         .101
        -1515        4            2.6611         1.338         .673
        -1535        2            4.7229         2.722        1.569
        -1555        4            7.6867         3.686        1.768
        -1575       10           11.4571         1.457         .185
        -1595       12           15.6481         3.684         .850
        -1615       25           19.5825         5.418        1.499
        -1635       32           22.4535         9.546        4.058
        -1655       21           23.5887         2.588         .284
        -1675       17           22.7104         5.710        1.436
        -1695       21.5         20.2305         1.269         .796
        -1715       18.5         16.1889         2.311         .330
        -1735       10           11.9874         1.987         .329
        -1755        5            8.1340         3.134        1.209
        -1775       10            6.2048         3.795        2.321
        -1795        2            1.7325         0.267         .041
        -1815        0            1.5037         1.504        1.504
    Beyond 1825      2            1.2291         0.770         .482

    Total          200          200.1975                  χ² = 22.699


The above table gives observed and theoretical values for 20 
mm. grouping. These have been plotted both in histogram and 
in mid-ordinate continuous curve form. (See Plate I). 

The equation to the theoretical Gaussian is (in 20 mm. working
units):

\[ Y = 23.682 \times \exp\left\{ -\frac{(1656.25 - X)^2}{36.3259} \right\} \]

where X = stature in mm.
      Y = frequency.

Mean = 1656.2938 mm.
S.D. = 67.38498 mm.

Unit of grouping = 20 mm.


In order to avoid fractions of individuals in theoretical values
we stop at 1475 mm. and 1825 mm.

With n' = 19, χ² = 22.699.

From Table XII, p. 26 we find

    for χ² = 22,       P = .231985
        χ² = 23,       P = .190590
        difference       = .041395

    for χ² = 22.699,   P = .231985 - .699 × (.041395)

Thus P = .2030.
Thus 
We can now find the probable error of P. Pearson 1 has
shown that

\[ \sigma_P = \tfrac{1}{2}\,\sigma_{\chi^2}\,\{ P_q(\chi^2) - P_{q-2}(\chi^2) \} \]

and

\[ \sigma^2_{\chi^2} = 2(q-1) + q/H + q(q-1)/N, \]

where q = number of cells and H = harmonic mean of expected
frequency.

In the present case, q = 19, N = 200, q/H = 4.4137.

Hence σ²_{χ²} = 42.1237, giving ½σ_{χ²} = 3.245;
also P_19 = .2030 and P_17 = .1226;
thus σ_P = 0.2609.

We get finally,

    P = .2030 ± .1760.

The chances are 4 to 1 against its being a random sample.
In other words about once in five trials we would get worse
fits than this. The probable error of P is large. Still the fit is
not very bad, for odds of 4 to 1 cannot be considered excessive.
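The arithmetic leading to the probable error of P can be retraced as follows. This is a sketch only: it takes the two formulae in the form in which they appear to stand in the text, with the factor of one half applied to the standard deviation of χ², and uses 0.67449 to pass from the standard deviation to the probable error. The function and argument names are illustrative.

```python
import math


def probable_error_of_p(q, n_total, q_over_h, p_q, p_q_minus_2):
    """Return (variance of chi2, half its s.d., s.d. of P, probable error of P)."""
    var_chi2 = 2 * (q - 1) + q_over_h + q * (q - 1) / n_total
    half_sigma = 0.5 * math.sqrt(var_chi2)
    sigma_p = half_sigma * (p_q - p_q_minus_2)
    return var_chi2, half_sigma, sigma_p, 0.67449 * sigma_p


# Figures quoted in the text for the 20 mm. grouping.
var_chi2, half_sigma, sigma_p, pe_p = probable_error_of_p(
    q=19, n_total=200, q_over_h=4.4137, p_q=0.2030, p_q_minus_2=0.1226
)

print(f"variance of chi2 = {var_chi2:.4f}")
print(f"half s.d. = {half_sigma:.3f}, sigma_P = {sigma_p:.4f}, P.E. = {pe_p:.4f}")
```

On this reading the intermediate and final figures agree with those given above.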

We notice that the contributions of the terminal ranges to
χ² are heavy, being 3.265, 1.504 and .482. Combining the two
terminal groups at each end we find χ² = 18.482, and n' = 17. We
get P = .2978, which gives a decent fit. In three trials out of ten,
random sampling would give us worse fits.


Table 3.

Mean = 1656.25 mm.          Unit of grouping = 50 mm.
S.D. = 67.3843 mm.

    Stature in mm.   Observed     Theoretical     (m' - m).    (m' - m)²/m
                     Value m'.    Value m.

    Beyond 1530        8            6.0993         1.9007        .5902
    1530-1580         14           19.6830         5.6830       1.6408
        -1630         45           43.9045         1.0955        .0231
        -1680         60           57.8639         2.1361        .0788
        -1730         48           45.0741         2.9259        .1800
        -1780         20           20.7464         0.7464        .0268
    Beyond 1780        5            6.6292         1.6292        .4002

    Total            200          200.0004         n' = 7     χ² = 2.8399


From Tables by interpolation, we get

    P = 0.826583 ± .288686

The probable error is large, but a high value of P is not improbable.
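The theoretical column of Table 3 can be regenerated from the Mean and S.D. given at its head. The sketch below assumes that each theoretical value is the area of the normal curve over the group multiplied by 200; the group boundaries and observed values are those of the table, while the function names are illustrative. The χ² it returns is recomputed from unrounded frequencies and need not agree with the printed total to the last figure.

```python
import math


def normal_cdf(x, mean, sd):
    """Area of the normal curve below x."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def expected_frequencies(boundaries, mean, sd, n_total):
    """Theoretical frequency of each group, including the two open end groups."""
    cdf = [0.0] + [normal_cdf(b, mean, sd) for b in boundaries] + [1.0]
    return [n_total * (upper - lower) for lower, upper in zip(cdf, cdf[1:])]


def chi_square(observed, expected):
    """Sum of (observed - theoretical)^2 / theoretical over the groups."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


BOUNDARIES = [1530, 1580, 1630, 1680, 1730, 1780]  # mm., as in Table 3
OBSERVED = [8, 14, 45, 60, 48, 20, 5]

expected = expected_frequencies(BOUNDARIES, mean=1656.25, sd=67.3843, n_total=200)
chi2 = chi_square(OBSERVED, expected)

for e in expected:
    print(f"{e:8.4f}")
print(f"chi2 = {chi2:.4f}")
```

The same functions can be used with the 20 mm. boundaries of Table 2, or with any of the alternative values of the S.D. tried below.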

1 Phil. Mag. Vol. LII, 1916, pp. 369-378.


1922.] P. C. Mahalanobis: Analysis of Stature. 39

The fit is now excellent. In 83 trials out of 100 the fit
will be worse than this. We conclude therefore that with 50 mm.
grouping, the Gaussian curve is quite adequate for purposes of
graduation. With this unit of grouping we may then safely
investigate the statistical properties of the general population. 1 In
subsequent analysis we have for this reason always adopted 50
mm. as our unit. With finer groupings we are likely to obtain
mere individual peculiarities of our sample which may not have
any connexion whatever with the properties of the general
population.

We shall try the effect of other values of Mean and S.D. on
“Goodness of Fit.”

With 20 mm., M = 1656.2938, S.D. = 67.1325,
n' = 19, χ² = 25.5942, P = .109881.

Only once in ten trials the fit will be worse. The end
contributions being rather large, we again combine the terminal
frequencies and obtain a much better fit.

n' = 17, χ² = 21.2072, P = .1712.

That is, once in six trials we will get a worse fit.


Table 4.

Summary of “Goodness of Fit.”

    Mean.             S.D.         n'.      χ².        P.

    Unit of grouping = 20 mm.

    1656.2938 mm.     67.1315      19      25.5942    .109881
                                   17      21.2073    .1712
    1656.2938         67.38498     19      22.699     .203009
                                   17      18.482     .297274

    Unit of grouping = 50 mm.

    1656.25 mm.       69.00         7       3.47      .756724
    1656.51           67.2195       7       2.9382    .815698
    1656.25           67.475        7       3.0209    .806585
    1656.25           67.38498      7       2.7788    .833398


20 mm. gives a fit of about the same order in each case.
Even with such fine grouping, we get an indication that Gaussian
distribution is not impossible, but we cannot assert that the normal
curve is fully adequate.

With 50 mm., the fit is excellent in every case. Even with
the highest observed value of S.D., namely 69.00 mm., we
get P greater than .75, i.e. in three cases out of four, a random
fit will be worse. Thus we see that the effect of different units
of grouping (in calculating moment coefficients) on the Goodness of
Fit is negligible.


1 This is the reason why 50 mm. is selected as our standard unit of grouping,
for purposes of comparison. See page 21.

We must note however that the Goodness of Fit is a much
more sensitive criterion than P.E. in judging the accuracy of a S.D.
We notice that with S.D. = 67.385 (the value finally adopted) P is .83,
which is substantially better than P = .75 with S.D. = 69.00 mm.

We conclude that with 50 mm. unit of grouping, a Gaussian
curve is fully adequate in every way. 1

Note on the Limits of the Unit of Grouping.

In Section I we saw that up to a certain unit of grouping,
which in our case was 50 mm., the effect of grouping on the
frequency constants was negligible. Let this upper limit of
grouping be h_m. On the other hand, in the present section, we
have seen that there is a lower limit of grouping for which the
goodness of fit is satisfactory. Let this lower limit be h_l. In our
case, it is again 50 mm.

Evidently, the sizes of h_m and h_l both depend on the size
of the sample. If the distribution is truly Gaussian, then these
should depend only on the size of the sample and the S.D. It
will be extremely useful to obtain even a rough idea about h_m and
h_l for any given size of sample.

We can study the problem empirically. We must remember
Bernoulli’s law which requires that accuracy should depend on
the square root of the total number of measurements. As the
simplest alternative we can try, if N is the total size of sample and
A and B are constants,

    h_m = A√N   and   h_l = B/√N

In our case we have h_m = 50 mm. and h_l = 50 mm. Substituting,
we get

    A = 50/√200 = 3.5355

    B = 50√200 = 707.1068


I provisionally suggest that

(a) In the case of Stature, in calculating frequency constants,
the unit of grouping should be less than 3.5√N.

(b) In testing goodness of fit, the unit of grouping should be
greater than 700/√N.

I do not of course attach much value to the numerical
magnitudes of A and B given here; study of a single example is
obviously not sufficient. I give the above analysis as a suggestion.


1 This result is well brought out in the 50 mm. graph, but it is quite
impossible to judge the goodness of fit by merely looking at a curve.
Adopting the above values of A and B, we get the following
table:

       N        h_m       h_l

       10        11       222
       20        16       157
       50        25       100
      100        35        70
      200        50        50
      500        80        30
     1000       110        20
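The two limits are easily tabulated for any size of sample. The sketch below fixes A and B from the single case N = 200, h_m = h_l = 50 mm., exactly as in the text; the function name is an illustrative assumption. The printed table appears to use the rounded constants 3.5 and 700 and to round the results further, so the figures printed here will differ slightly from it.

```python
import math

N0, H0 = 200, 50.0        # the one sample on which the suggestion rests
A = H0 / math.sqrt(N0)    # upper-limit constant
B = H0 * math.sqrt(N0)    # lower-limit constant


def grouping_limits(n, a=A, b=B):
    """Return (h_m, h_l) in mm. for a sample of size n."""
    root = math.sqrt(n)
    return a * root, b / root


print(f"A = {A:.4f}, B = {B:.4f}")
for n in (10, 20, 50, 100, 200, 500, 1000):
    h_m, h_l = grouping_limits(n)
    print(f"{n:5d}  {h_m:7.1f}  {h_l:7.1f}")
```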


With small samples of 10, h_m is 11. Grouping for calculation
of frequency constants is thus justified even in the case of small
samples. On the other hand for N = 10, h_l is over 200 mm., which
shows the absolute impossibility of judging the adequacy of fit
in the case of small samples. In fact with samples of less than 50
(for which h_l = 100 mm.) it is practically impossible to test the
goodness of fit and hence to judge the reliability of any inference
about the general population. Even with N = 1000, the lower
limit is not reduced below 20 mm. Thus, discontinuities of less
than 20 mm. may easily escape in samples of 1000.

It should be observed that so long as h_l is greater than h_m,
we cannot hope to attain great accuracy in judging the significance
of a fit so far as the general population is concerned. We see,
however, that with samples of 200, h_m = h_l = 50 mm. It then
becomes only just possible to assert anything about the population
sampled with any certainty. It seems as if 200 is the lower limit
of safe sampling for anthropological purposes (at least so far as
stature is concerned).


Type IV skewness, lepto-kurtosis.

For Anglo-Indian Stature, our fundamental constants are (in
50 mm. working units):

    Mean = 1656.79 ± 3.2136 mm.
    S.D. = 67.38498 ± 2.5585 mm.
    V    = 4.0672 ±
    β_1  = .068756 ± .079781
    β_2  = 3.5046 ± .6017
    Sk.  = +.1053 ± .0568
    d    = 7.0963 ± 4.7818 mm.
    μ_2  = 1.816294 ± .122491
    μ_3  = -.641853 ± .286080
    μ_4  = 11.561403 ± 1.819117


•86 66 61 (') 


1 From Biometric Table XU I (a), p. 78. 



42 Records of the Indian Museum. [Vol. XXIII,


The curve is not significantly skew. But there is distinct 
tendency towards lepto-kurtosis. 
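The constants β_1, β_2, S.D. and V follow from the moment-coefficients by the usual definitions, and the list can be checked accordingly. The sketch below assumes the 50 mm. working unit stated with the constants and takes the Mean as it is printed in that list; the variable names are illustrative.

```python
import math

UNIT_MM = 50.0      # working unit stated with the list of constants
MEAN_MM = 1656.79   # Mean as printed in the list of constants

# Moment-coefficients about the Mean, in working units, as printed.
mu2, mu3, mu4 = 1.816294, -0.641853, 11.561403

beta1 = mu3 ** 2 / mu2 ** 3        # measure of asymmetry
beta2 = mu4 / mu2 ** 2             # measure of kurtosis (3 for the Gaussian)
sd_mm = UNIT_MM * math.sqrt(mu2)   # standard deviation in mm.
coeff_var = 100.0 * sd_mm / MEAN_MM

print(f"beta1 = {beta1:.6f}, beta2 = {beta2:.4f}")
print(f"S.D. = {sd_mm:.5f} mm., V = {coeff_var:.4f}")
```

A value of β_2 above 3 with β_1 near zero is what the text describes as a tendency towards lepto-kurtosis without significant skewness.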

The curve belongs to Type IV of Pearson’s Skew Curves. 1
The probable errors of β_1 and β_2 are quite large, and we may
investigate whether the β_1-β_2 “probability ellipse” touches the
Gaussian point G. 2

In order to do this we must find Σ_1, Σ_2, and the semi-minor
and semi-major axes of the “probability ellipse.”


£, = *o6 87 56 
A> = 3*5 
= 3-6 

Pi = 3-50 46, 

Similarly 
Multiplying by 


ri77 v 2 V. 2 , = i*4 + -37 912 
i *5 + '37 9 12 
1*177 V / iV2 J = 1*55 79 91 
1-177 \/NZ 2 = 1351 71 2 
x , =-04769,we get 
semi-minor axis = '0743 


*[•4] = 1*551648 
1*6885 


*[• 5 ] = 


•13791:2 


semi-major axis = -6446 
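The last step of this computation, the passage from the tabled quantities to the two semi-axes, can be retraced as below. This is a sketch resting on a reading of the figures above: the two tabled quantities are taken to be 1.557991 and 13.51712, and the multiplier .04769 is assumed to arise as 0.67449/√N with N = 200, with which it agrees numerically.

```python
import math

N = 200
multiplier = 0.67449 / math.sqrt(N)   # assumed origin of the factor .04769

TABLED_MINOR = 1.557991    # tabled quantity for the minor axis, as read from the text
TABLED_MAJOR = 13.51712    # tabled quantity for the major axis, as read from the text

semi_minor = TABLED_MINOR * multiplier
semi_major = TABLED_MAJOR * multiplier

print(f"multiplier = {multiplier:.5f}")
print(f"semi-minor axis = {semi_minor:.4f}, semi-major axis = {semi_major:.4f}")
```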


Tracing a probability ellipse with these values and centering
the ellipse at the point β_1 = .07 and β_2 = 3.5 approximately, on
the diagram on p. 66 of Biometric Tables, we find that the Gaussian
point G falls just within the ellipse. We also note that the ellipse
covers a small area of the Type III region.

We conclude therefore that a Gaussian distribution itself
is not unlikely and may be expected to give a good fit. Type
III is not altogether impossible but as the major portion of the
ellipse lies within the Type IV region, the lepto-kurtosis is
probably just significant. 3


Comparative Data. 

Our frequency curve is approximately Gaussian in type.
The asymmetry is very slight, skewness is small and positive
(Mode is greater than the Mean) and the curve belongs to Type IV
with lepto-kurtosis.

A. O. Powys 4 has discussed distribution of stature for
different age groups of New South Wales criminals. The author
says, “by looking at the curves, we see that the material is
extremely homogeneous 5 ... the stature distribution of these

1 See Memoirs cited above in footnote on p. 16.

2 A discussion of these points is given by A. Rhind: “Additional Tables
and Diagram for the Determination of the Errors of Type of Frequency
Distribution.” Biometrika Vol. 7 (1910), pp. 386-397.

3 The asymmetry is very slight and the distance between the Mode and
the Mean is also quite small. On the whole there is very little to choose between
the “normal” and a Type IV curve. The latter may give slightly improved fit.

4 A. O. Powys: “Anthropometric Data from Australia,” Biometrika Vol.
1 (1902), p. 30.

5 Ibid., p. 38.
homogeneous groups is nearly normal, but what divergence
there is lies in the direction of Type IV.” 1 In the case of
males, the skewness is always positive and the Mode is greater
than the Mean. 2 Powys used very long series of measurements
extending to several thousands in each age group. The
distribution is lepto-kurtic in every case.

W. R. Macdonell 3 finds in the case of 3,000 English convicts
the stature curve to be of Type IV. The skewness is small and
negative and there is slight lepto-kurtosis. Mode is less than the
Mean.

In the case of Verona statistics 4 the stature of 16,203
conscripts shows significant lepto-kurtosis 5 and a Type IV
distribution while 3,810 selected recruits show equally significant
“platy-kurtosis.” 6 Both have significant positive asymmetry
and Mode is greater than the Mean.

J. F. Tocher 7 finds lepto-kurtosis for the Scottish Insane,
the curve belongs to Type IV, and there is small positive
skewness, Mode being greater than the Mean. 8 For long series then,
viz. New South Wales males, Italian conscripts, Italian recruits
and Scottish Insane, there is agreement as to skewness: in all four
cases it is significantly positive; in one case, the American recruits, 9
there is quite significant negative asymmetry. American recruits
also differ in showing meso-kurtosis. 10

Charles Goring 11 in the case of the English convict found 
the distribution approximately Gaussian in type for all crime- 
groups excepting one. In the only case in which the distri¬ 
bution is significantly different from the normal, the curve is 
of Type IV with significant lepto-kurtosis and marked positive 
skewness. 

Orensteen 12 found in the case of Cairo-born Egyptians, that
the distribution was nearly symmetrical. The criterion κ however
is less than 1, hence the curve really belongs to Type IV.


1 Ibid., p. 39.

2 Ibid., p. 43. Powys mentions skewness as negative. This is probably a slip.

3 W. R. Macdonell: “On Criminal Anthropometry and the Identification
of Criminals,” Biom. Vol. 1 (1902), pp. 177-227.

4 Quoted in Miscellanea, Biom. Vol. 4 (1906), p. 506 and referred to by
J. F. Tocher (see below).

5 Lepto-kurtic curves are more sharp-topped than the normal curve, the
rise being sharper than the Gaussian.

6 Platy-kurtic is “flat-topped” as compared to the Gaussian.

7 J. F. Tocher: “Anthropometric Characteristics of the Inmates of Asylums
in Scotland,” Biom. Vol. 5 (1917), p. 301.

8 Ibid., p. 182. Tocher says that for long series asymmetry is negative.
He evidently means μ_3. This however is slightly ambiguous and may give rise
to confusion. I have thought it better to refer to Skewness in each case, which
has its sign opposite to that of μ_3, so that Mode is greater or less than the Mean
according as skewness is positive or negative (and μ_3 negative or positive).

9 K. Pearson. Phil. Trans. Roy. Soc., Vol. 186 A (1894), p. 385.

10 Meso-kurtosis signifies about the same degree of flatness as the Gaussian.

11 Charles Goring: “The English Convict,” p. 199.

12 Myer M. Orensteen: “Correlation of Anthropometrical Measurements in
Cairo-born Natives,” Biom. Vol. XI (1915), p. 71.
Conclusion.

(1) The Gaussian curve is quite adequate for graduating a
short series of 200 Anglo-Indian measurements. This confirms
C. D. Fawcett’s rule of normal distribution for short series 1 of
anthropometric measurements.

(2) There is some tendency towards Type IV, with lepto- 
kurtosis. All long series, with the exception of American and 
Italian recruits seem to be definitely lepto-kurtic. It is therefore 
likely that stature distribution is in general slightly lepto-kurtic 
in character, but this small lepto-kurtosis does not become statis¬ 
tically significant in small samples. 

(3) Skewness is small and positive (Mode being greater than the 
Mean) for New South Wales criminals, Italian conscripts, Italian 
recruits, Scottish insane and a short series of several offenders 
among English criminals. It is negative in the case of several 
short series of English criminals, and for one long series viz. 
American recruits. For a short series of Anglo-Indians it is 
positive but is so small that it cannot be called significant. 
Hence we conclude that the small skewness of our present sample 
is not incompatible with homogeneity. 

(4) We conclude therefore that the distribution of stature in the 
case of Anglo-Indians is of the same nature as in the case of other 
samples where the material is known to be “ homogeneous .” In 
other words, the nature of distribution of stature does not reveal 
any presence of heterogeneity in the Anglo-Indian population. 2


1 Biometrika Vol. I (1902), p. 443.

2 Type IV of course is absolutely no indication against homogeneity. For a
detailed discussion of this point see K. Pearson: “Skew Variation, a Rejoinder,”
Biometrika Vol. 4 (1905), p. 181.



SECTION V. DISSECTION INTO COMPONENT CURVES.


I shall next consider the possibility of statistical dissection of
our frequency curve. It might be possible that the sample con¬ 
sisted of two (statistically) different strains. If this were so then it 
would be possible to break up the frequency distribution into two 
component normal distributions. 

The fundamental memoir on this subject is K. Pearson: “On
the Dissection of Asymmetrical Frequency Curves.” 1 Pearson has
discussed the application of the theory in several actual cases 2, 3
and has given the fundamental equations in a somewhat better
form in a paper “On the Problem of Sexing Osteometric Material.” 4
I have followed the notation of the fundamental memoir,
excepting in one or two instances, where I have used a slightly
modified notation.

But before proceeding to a full discussion of the subject it 
will be useful to apply some simpler tests of homogeneity. 


Agreement of Sub-samples. 

The whole group of two hundred cards was arbitrarily
divided into two sub-groups of 100 cards each. The Frequency
Constants were calculated for each of these two sub-groups and 
compared. 

The unit of grouping adopted was 50 mm. in each case. 


Mean:
     1st group of 100   = 1658.75 ± 4.6436 mm.
     2nd group of 100   = 1657.00 ± 4.9414
     Difference         =    1.75 ± 6.7808 5


Standard Deviation:
     2nd group          = 73.26 ± 3.49 mm.
     1st group          = 68.85 ± 3.28
     Difference         =  4.41 ± 4.79


1 Phil. Trans. Vol. 185A (1894), pp. 71-110.

2 K. Pearson: “On the Applications of the Theory of Chance to Racial
Differentiation,” Phil. Mag. 1901, p. 110.

3 K. Pearson: “On the Probability that two Independent Distributions of
Frequency are really Samples of the Same Population, with Special Reference
to Recent Work on the Identity of Trypanosome Strains,” Biometrika Vol.
10 (1915), p. 123 ff.

4 Biometrika Vol. 10 (1915), pp. 479-487.

5 It is well known that the P.E. of a sum or a difference is given by the square
root of the sum of the squares of the P.E. (see Yule, Statistics, p. 211).
Coeff. of variation:
     2nd group    = 4.4212 ± 0.2113
     1st group    = 4.1504 ± 0.1983
     Difference   = 0.2708 ± 0.2897

The difference is in no case significant.

Passing on to the other constants we get

$\mu_2$:
     1st group    = 2.146651 ± 0.2048
     2nd group    = 1.896075 ± 0.1809
     Difference   = 0.250576 ± 0.2733

$\mu_3$:
     1st group    = 0.287592 ± 0.5196
     2nd group    = 1.299010 ± 0.4314
     Difference   = 1.011418 ± 0.6753

$\mu_4$:
     1st group    = 12.269601 ± 3.0455
     2nd group    = 11.912723 ± 2.3219
     Difference   =  0.356878 ± 0.3830

$\beta_1$:
     1st group    = 0.08357 ± 0.1099
     2nd group    = 0.24890 ± 0.1204
     Difference   = 0.16533 ± 0.1630

$\beta_2$:
     1st group    = 3.3256 ± 0.6580
     2nd group    = 2.6627 ± 0.2609
     Difference   = 0.6629 ± 0.7078

We conclude that the first hundred measurements are not
significantly differentiated from the second hundred in any way.
Both represent “random” samples of the same general population.
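The comparison of the two sub-groups can be sketched in a few lines of Python. This is only an illustrative check, not part of the original computation: the function name `pe_of_difference` is hypothetical, and the figures are those printed for the Mean and Standard Deviation. It applies the rule that the probable error of a difference is the square root of the sum of the squared probable errors.

```python
import math

def pe_of_difference(pe_a: float, pe_b: float) -> float:
    """Probable error of a sum or difference of two independent quantities."""
    return math.hypot(pe_a, pe_b)

# (value, probable error) for the two sub-groups of 100, as printed in the text.
constants = {
    "mean (mm)": ((1658.75, 4.6436), (1657.00, 4.9414)),
    "s.d. (mm)": ((73.26, 3.49), (68.85, 3.28)),
}

for name, ((a, pe_a), (b, pe_b)) in constants.items():
    diff = abs(a - b)
    pe = pe_of_difference(pe_a, pe_b)
    # A ratio well below 2 or 3 is read here as "not significant".
    print(f"{name}: difference = {diff:.2f} +/- {pe:.2f}, ratio = {diff / pe:.2f}")
```

A ratio of difference to probable error near or below unity, as found here, is what the text means by the sub-groups not being significantly differentiated.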

It should be noted however that the difference between the 
two samples of hundred each, is of the same order as the probable 
error of the difference. In one case viz. the difference is 
actually greater than its probable error. This shows that 100 is 
very nearly approaching the critical limit of “ fair (i.e. representa¬ 
tive) sampling.” [See section III, footnote 8, pp. 32-33]. 

There is grave danger of samples of less than one hundred being
not representative in character (at least so far as the stature of 
populations of the same order of variability as the Anglo-Indian is 
concerned). The discussion on p. 40 Section IV shows however 
that two hundred is about the lower limit for safe inferences about 
the general population. 
Trial Solutions by “Tail” Functions.

Consider a mixture of two homogeneous components. If the
Means of these components are sufficiently wide apart, the “tail”
(i.e. the terminal frequencies) on each side will represent an
approximately homogeneous part of the component on that side.
Or if the variability of one component is sufficiently greater than
the other, the terminal frequencies on its own side will give a
fairly homogeneous “tail,” even though the Means are not widely
different.

We can fit a normal (Gaussian) curve to the “tail,” that
is, to the terminal frequencies only, with the help of the
“tail” functions. If the “tail” is significantly different from
the whole sample, then the Gaussian which describes the “tail”
satisfactorily may be quite different from the Gaussian which fits
the whole sample. For example, if we get two “tail” distributions
which are each different from the whole distribution, and
yet when added together reproduce the total distribution, then we
are pretty certain that these “tails” each represent one component
of the given sample. Even when we find only one “tail”
which is different from the total distribution we can always find
the other component by subtraction from the total curve.

This method belongs to the trial and error type. The “tail
curves” obtained by considering different portions of the tail may
themselves differ. The uncertainty in the terminal frequencies
must be considerable and, as Dr. Lee observes, “the chief weakness
of the method, besides the assumption of the Gaussian, often
quite legitimate, is the absence as yet of the values of probable
errors, which must be very considerable for slender material.” 1

For the purposes of “tail” functions, 50 mm. gives too
broad groupings. Hence I have found it necessary to work with 
20 mm. groupings. 
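The fitting of a Gaussian to a “tail” can be sketched numerically as follows. This is a minimal illustration under stated assumptions, not the tabular procedure used in the text: instead of interpolating in the printed tables, the functions $\psi_1 = \Sigma^2/d^2$, $\psi_2 = \sigma/d$ and n/N are computed from the truncated normal curve and $\psi_1$ is inverted by bisection. The names `tail_functions`, `solve_h` and `normal_from_tail` are hypothetical.

```python
import math

def _ordinate(h):
    """Ordinate of the standard normal curve at h."""
    return math.exp(-0.5 * h * h) / math.sqrt(2.0 * math.pi)

def _upper_area(h):
    """Area of the standard normal curve beyond h."""
    return 0.5 * math.erfc(h / math.sqrt(2.0))

def tail_functions(h):
    """Return (psi1, psi2, n_over_N) for a normal curve cut at h sigma from its mean."""
    lam = _ordinate(h) / _upper_area(h)   # centroid of the tail, from the mean (sigma units)
    d = lam - h                           # centroid of the tail, from the stump
    var = 1.0 + h * lam - lam * lam       # second moment of the tail about its centroid
    return var / (d * d), 1.0 / d, _upper_area(h)

def solve_h(psi1, lo=-3.0, hi=6.0, tol=1e-10):
    """Invert psi1(h) by bisection; psi1 is assumed to increase with h on this bracket."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if tail_functions(mid)[0] < psi1:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def normal_from_tail(d, mu2, n, unit):
    """Constants of the full normal curve from tail moments given in working units."""
    h = solve_h(mu2 / (d * d))
    _, psi2, frac = tail_functions(h)
    sigma = psi2 * d * unit               # standard deviation in mm
    return {"h": h, "sigma": sigma, "mean_offset": h * sigma, "N": n / frac}

print(tail_functions(0.7071))
```

Here `mean_offset` is the distance of the Mean from the stump, measured towards the body of the curve, and `N` is the total of the fitted curve implied by the tail total `n`.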

Curtailing at 1585, we get the following:

Group (mm.)   1585-1565  1565-1545  1545-1525  1525-1505  1505-1485  1485-1465  1465-1445  Total
Frequency        10          4          2          4          1          1          2        24


Taking origin at end of range 1585, we get raw moments
$\nu_1' = d = 2.208333$ and $\nu_2' = 8.666667$,

$\mu_2 = \Sigma^2 = 3.789931$


1 K. Pearson and Alice Lee: Generalised Probable Error in Multiple Normal
Correlation. Biometrika Vol. 6 (1908), pp. 59-68. Alice Lee: Table of the
Gaussian Tail Functions. Biometrika Vol. 10 (1914), pp. 208-214; Biometric
Tables, p. xxvii.
Hence $\psi_1 = \Sigma^2/d^2 = 0.6928$.

From Biometric Tables XI, p. 25, we get
     $\psi_1 = 0.6928$
     $h' = 0.777145$
     $\psi_2 = 1.757030$

Thus $\sigma = \psi_2 d = 1.757030 \times 2.208333 = 3.880107$.

Mean is at a distance $h = \sigma h' = 3.015417$ (in working units) from
origin.

From Table II: n/N = 0.2185685
Thus we get a normal curve of

     N = 110 individuals
     Mean = 1645.3 mm.
     S.D. = 77.6 mm.

Curtailing at 1605 we get a fresh table:

Group (mm.)   1605-1585  1585-1565  1565-1545  1545-1525  1525-1505  1505-1485  1485-1465  1465-1445  Total
Frequency        12         10          4          2          4          1          1          2        36


Calculating “raw” moments about end of stump (1605 mm.)
we get

     $\nu_1' = d = 2.305556$,  $\nu_2' = 9.472222$

giving corrected $\mu_2 = 3.712389$.

Thus $\psi_1 = \mu_2/d^2 = 0.644531$.

From Biometric Tables XI, p. 25, we get by interpolation

     $\psi_1 = 0.644531$
     $h' = 0.4433$
     $\psi_2 = 1.523267$

Thus $\sigma = \psi_2 d = 3.511747$ (in working units)
or $\sigma = 70.2349$ mm.

Mean is at distance

     $h'\sigma = 0.4433 \times 70.2349$ mm. from 1605 mm.

Thus Mean = 1635.14 mm.
and n/N = 0.3227642 (from Table II).
We finally get the following for the shorter end of the frequency
distribution:

     N = 112
     Mean = 1635.14 mm.
     S.D. = 70.23 mm.

This gives a “shorter” group differing in the average stature
but with about the same variability as the total sample.

Let us now turn to the taller end. 

Curtailing at 1705, we get

Group (mm.)   1705-1725  1725-1745  1745-1765  1765-1785  1785-1805  1805-1825  1825-1845  1845-1865  Total
Frequency       18.5       10.0        5.0       10.0        2.0         0         1.0        1.0      47.5


With origin at 1705, raw moments are
     $\nu_1' = 2.00$
     $\nu_2' = 6.734211$, leading to $\mu_2 = 2.734211$
     $\psi_1 = 0.6833$
Thus $h' = 0.7071$
     $\psi_2 = 1.6818$

and we obtain

     N = 198
     Mean = 1659.02 mm.
     S.D. = 67.27 mm.

which is practically identical with the whole sample.

Thus the “taller” end seems to represent a homogeneous sample
of the whole group, and starting from the taller end, we do not succeed
in breaking up the given frequency distribution into two normal
sub-groups.

The “shorter” end gives a pseudo-component. I shall show
later on, when we consider the question of age-differentiation, that
the shorter tail represents approximately the smaller age groups.

Asymmetrical Dissection. 

We have seen that our frequency curve is slightly asymmetric.
As Pearson observes, 1 “the asymmetry may arise from the fact
that the units grouped together in the measured material are not
really homogeneous. It may happen that we have a mixture of
2, 3, ... n homogeneous groups, each of which deviates about its
mean symmetrically and in a manner represented by the normal
curve.”


1 Karl Pearson: “Contributions to the Mathematical Theory of Evolution.
I. On the Dissection of Asymmetrical Frequency Curves,” Phil. Trans. Roy.
Soc., Vol. 185A, 1894, p. 72.
Thus an asymmetrical frequency curve may be really built up
of normal curves having parallel but not necessarily coincident
axes and different parameters. The object of the present section
is to discuss the possibility of splitting up our asymmetrical frequency
curve into two component normal curves. 1

Pearson gave the necessary mathematical formulae 2 for this purpose
in his memoir of 1894. The solution depends on finding the
roots of a numerical equation of the ninth degree, and the arithmetical
calculations are extremely laborious. Pearson has discussed
the application of the theory in several actual cases. 3

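Let $\mu_2$, $\mu_3$, $\mu_4$ and $\mu_5$ be the moment-coefficients, M the mean
and N the total of the given frequency curve. Let $m_1$, $m_2$ be the
means, $\sigma_1$, $\sigma_2$ the standard deviations and $n_1$, $n_2$ the totals of the
component curves.

Then if h is the unit of grouping

$$m_1 = M + \gamma_1 h \quad\text{and}\quad m_2 = M + \gamma_2 h.$$

Also, taking h = 1, we have

$$\sigma_1^2 = \mu_2 - \tfrac{1}{3}\,\frac{\mu_3}{\gamma_2} - \tfrac{1}{3}\,p_1\gamma_1 + p_2, \qquad n_1 = -\frac{\gamma_2}{\gamma_1 - \gamma_2}\,N,$$

$$\sigma_2^2 = \mu_2 - \tfrac{1}{3}\,\frac{\mu_3}{\gamma_1} - \tfrac{1}{3}\,p_1\gamma_2 + p_2, \qquad n_2 = \frac{\gamma_1}{\gamma_1 - \gamma_2}\,N.$$

Let
$$p_1 = \gamma_1 + \gamma_2, \quad p_2 = \gamma_1\gamma_2 \quad\text{and}\quad p_3 = p_1 p_2.$$

Also
$$\lambda_4 = 9\mu_2^2 - 3\mu_4, \qquad \lambda_5 = 30\mu_2\mu_3 - 3\mu_5.$$

Then
$$p_3 = \frac{2\mu_3^3 - 2\mu_3\lambda_4 p_2 - \lambda_5 p_2^2 - 8\mu_3 p_2^3}{4\mu_3^2 - \lambda_4 p_2 + 2p_2^3}.$$

“Hence, so soon as $p_2$ is known, $p_1 = p_3/p_2$ can be found, and
then $\gamma_1$ and $\gamma_2$ will be the roots of:

$$\gamma^2 - p_1\gamma + p_2 = 0.$$

The equation for finding $p_2$ is one of the ninth degree:

$$24p_2^9 - 28\lambda_4 p_2^7 + 36\mu_3^2 p_2^6 - (24\mu_3\lambda_5 - 10\lambda_4^2)p_2^5 - (148\mu_3^2\lambda_4 + 2\lambda_5^2)p_2^4$$
$$+\, (288\mu_3^4 - 12\lambda_4\lambda_5\mu_3 - \lambda_4^3)p_2^3 + (24\mu_3^3\lambda_5 - 7\mu_3^2\lambda_4^2)p_2^2 + 32\mu_3^4\lambda_4 p_2 - 24\mu_3^6 = 0.”$$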

1 Ibid., p. 72. “There are reasons, indeed, why the resolution into two is of
special importance. A family probably breaks up into two species, rather than
three or more, owing to the pressure at a given time, of some particular form of
natural selection. Even where the heterogeneity may be three-fold or more,
the dissection into two is likely to give us, at any rate, an approximation to the
chief groups.”

2 The fundamental formulae have been expressed in a slightly modified form
in terms of the $\beta$-constants in a recent paper “On Sexing Osteometric Measurements,”
Biometrika Vol. 10, 1915, pp. 479-487.

3 Pearson: “On the Applications of the Theory of Chance to Racial
Differentiations,” Phil. Mag. 1901, p. 110.

K. Pearson: “On the Probability that two Independent Distributions of
Frequencies are really Samples of the Same Population, etc.,” Biometrika Vol.
10, 1915, p. 123 et seq.
In our case we have, for 50 mm. unit of grouping,

     $\mu_2 = 1.821042$
     $\mu_3 = -0.47784377$
     $\mu_4 = 11.94162208$
     $\mu_5 = -7.7823$

Thus $\lambda_4 = -5.97912055$
     $\lambda_5 = -2.75830724$

After some laborious arithmetical calculations 1 we find the
fundamental nonic:

$$p_2^9 + 6.97564064\,p_2^7 + 0.34249930\,p_2^6 + 13.57765977\,p_2^5 + 7.82662824\,p_2^4 + 17.63903620\,p_2^3$$
$$-\, 1.17494113\,p_2^2 - 0.40730898\,p_2 - 0.01190462 = 0.$$

I next form the nine Sturm’s auxiliary functions, retaining
four figures in the decimal.

$$\begin{aligned}
f_1(x) &= 9p_2^8 + 48.8295\,p_2^6 + 2.0550\,p_2^5 + 67.8883\,p_2^4 + 31.3065\,p_2^3 + 52.9171\,p_2^2 - 2.3499\,p_2 - 0.4073 \\
f_2(x) &= -1.5512\,p_2^7 - 0.1142\,p_2^6 - 6.0346\,p_2^5 - 4.3481\,p_2^4 - 11.7593\,p_2^3 + 0.9138\,p_2^2 + 0.3620\,p_2 + 0.0119 \\
f_3(x) &= -13.8658\,p_2^6 + 21.3152\,p_2^5 - 4.6837\,p_2^4 - 36.2180\,p_2^3 - 54.8628\,p_2^2 + 2.2860\,p_2 + 0.4073 \\
f_4(x) &= +1.6693\,p_2^5 - 0.5478\,p_2^4 - 0.9053\,p_2^3 - 9.2289\,p_2^2 + 0.0956\,p_2 + 0.0615 \\
f_5(x) &= +6.7018\,p_2^4 + 103.7845\,p_2^3 - 38.6184\,p_2^2 - 1.8367\,p_2 + 0.2104 \\
f_6(x) &= -417.5259\,p_2^3 + 160.8911\,p_2^2 + 7.1896\,p_2 - 0.8903 \\
f_7(x) &= -2.4849\,p_2^2 + 0.0194\,p_2 + 0.0164 \\
f_8(x) &= -5.6647\,p_2 - 0.15 \\
f_9(x) &= -0.0141
\end{aligned}$$

We can now find the number of real roots from the changes
of sign in the Sturm’s functions.

               x = +∞     x = 0     x = -∞
     f(x)         +          -         -
     f1(x)        +          -         +
     f2(x)        -          +         +
     f3(x)        -          +         -
     f4(x)        +          +         -
     f5(x)        +          +         +
     f6(x)        -          -         +
     f7(x)        -          +         -
     f8(x)        -          -         +
     f9(x)        -          -         -


1 My best thanks are due to Prof. J. M. Bose M.A., B.Sc. of the Mathematics 
Department of the Presidency College, Calcutta for his kind help in checking the 
arithmetic in many places. 
There are 3 changes of sign with x = +∞, 4 changes with x = 0
and 6 changes with x = -∞. Hence there is 4 - 3 = 1 real positive
root and 6 - 4 = 2 real negative roots.
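The counting of real roots by Sturm’s functions can be sketched in Python. This is an illustrative check rather than a reproduction of the hand computation: exact rational arithmetic (`fractions.Fraction`) replaces the four-figure working, so the intermediate functions differ from the printed ones by positive constant factors and rounding. The coefficient list is read from the printed nonic, and the $p_2^6$ coefficient is taken as 0.34249930, which is an assumption about a damaged figure.

```python
from fractions import Fraction

# Nonic in p2, highest power first (the p2^8 term is absent).
# Assumption: the p2^6 coefficient is read as 0.34249930.
NONIC = [Fraction(s) for s in (
    "1", "0", "6.97564064", "0.34249930", "13.57765977", "7.82662824",
    "17.63903620", "-1.17494113", "-0.40730898", "-0.01190462")]

def derivative(p):
    n = len(p) - 1
    return [c * (n - i) for i, c in enumerate(p[:-1])]

def remainder(a, b):
    """Remainder of the polynomial division a / b (exact arithmetic)."""
    a = list(a)
    while len(a) >= len(b):
        q = a[0] / b[0]
        for i in range(len(b)):
            a[i] -= q * b[i]
        a.pop(0)                      # leading term is now exactly zero
    while a and a[0] == 0:
        a.pop(0)
    return a

def sturm_chain(p):
    chain = [list(p), derivative(p)]
    while True:
        r = remainder(chain[-2], chain[-1])
        if not r:
            return chain
        chain.append([-c for c in r])  # each function is minus the remainder

def variations(signs):
    s = [x for x in signs if x != 0]
    return sum(1 for a, b in zip(s, s[1:]) if (a > 0) != (b > 0))

def sign_plus_inf(p):
    return 1 if p[0] > 0 else -1

def sign_minus_inf(p):
    return sign_plus_inf(p) * (-1 if (len(p) - 1) % 2 else 1)

def sign_zero(p):
    return (p[-1] > 0) - (p[-1] < 0)

chain = sturm_chain(NONIC)
v_plus = variations([sign_plus_inf(p) for p in chain])
v_zero = variations([sign_zero(p) for p in chain])
v_minus = variations([sign_minus_inf(p) for p in chain])
print("positive roots:", v_zero - v_plus, "negative roots:", v_minus - v_zero)
```

The number of real roots in an interval is the difference between the numbers of sign changes at its two ends, which is the rule applied in the text.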

By trial I locate the positive root between 0 and 1, and the
two negative roots between 0 and -1.

I try the following successive approximations by Horner’s
method.

     f(+0.2)   = +0.0177        f(+0.15)   = -0.0379
     f(+0.18)  = -0.0117        f(+0.187)  = -0.0009
     f(+0.188) = +0.0002        f(+0.1878) = -0.00002


Thus we can take the positive root, $p_2 = +0.1878$.
For the negative roots I try

     f(0)     = -0.0119         f(-0.5)  = -2.2775
     f(-0.25) = -0.4488         f(-0.01) = -0.0080
     f(-0.1)  = +0.0001


Root is near -0.1. I try higher approximations, now retaining
eight decimal figures.

     f(-0.1)    = +0.00008436
     f(-0.101)  = -0.00025415
     f(-0.1001) = +0.00005106
     f(-0.1003) = -0.00004478
     f(-0.1002) = +0.00001465


Thus $p_2 = -0.1002$ is another root.
Again

     f(-0.05)   = +0.0034
     f(-0.01)   = -0.0080
     f(-0.03)   = -0.0012
     f(-0.04)   = +0.0014
     f(-0.034)  = -0.00009779
     f(-0.0343) = -0.00001784
     f(-0.0344) = +0.00000869


Thus $p_2 = -0.0344$ is the third root.
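The trial location of the roots can be sketched with a short program. This is only an illustration: bisection on a fine grid is substituted for the hand application of Horner’s method, the names `horner`, `bisect` and `real_roots` are hypothetical, and the $p_2^6$ coefficient is again taken as 0.34249930 (an assumption about a damaged figure).

```python
# Nonic in p2, highest power first; the p2^6 coefficient is an assumed reading.
NONIC = [1.0, 0.0, 6.97564064, 0.34249930, 13.57765977, 7.82662824,
         17.63903620, -1.17494113, -0.40730898, -0.01190462]

def horner(coeffs, x):
    """Evaluate the polynomial at x by nested multiplication."""
    acc = 0.0
    for c in coeffs:
        acc = acc * x + c
    return acc

def bisect(f, lo, hi, tol=1e-9):
    """Halve an interval on which f changes sign until it is shorter than tol."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def real_roots(coeffs, a=-1.0, b=1.0, steps=2000):
    """Scan [a, b] for sign changes and refine each by bisection."""
    f = lambda x: horner(coeffs, x)
    roots = []
    for i in range(steps):
        lo = a + (b - a) * i / steps
        hi = a + (b - a) * (i + 1) / steps
        if f(lo) * f(hi) < 0:
            roots.append(bisect(f, lo, hi))
    return roots

for r in real_roots(NONIC):
    print(f"p2 = {r:+.4f}")
```

The search interval of -1 to +1 follows the brackets found by trial in the text.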

It should be observed that if the material is a real mixture of
two true normal components, then the mathematical solution
would be theoretically unique. In practice, however, a statistical
curve may be the sum of two asymmetric curves, and hence we
must not be surprised if more than one solution is given by the
present method of dissection. Each root of the fundamental
nonic gives one distinct mode of dissection.
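How one root of the nonic leads to a pair of components can be sketched as follows. This is a hedged illustration, not the author's worksheet: the expression for $p_3$ is the one given in Pearson's memoir as understood here, the function name `dissect` is hypothetical, and the general Mean M = 1656.25 mm is an assumption inferred from the component means, not a figure quoted in this section.

```python
import math

def dissect(p2, mu2, mu3, mu4, mu5, total, mean, unit):
    """Component constants for one root p2 of the nonic (moments in working units)."""
    lam4 = 9.0 * mu2 ** 2 - 3.0 * mu4
    lam5 = 30.0 * mu2 * mu3 - 3.0 * mu5
    p3 = ((2.0 * mu3 ** 3 - 2.0 * mu3 * lam4 * p2 - lam5 * p2 ** 2 - 8.0 * mu3 * p2 ** 3)
          / (4.0 * mu3 ** 2 - lam4 * p2 + 2.0 * p2 ** 3))
    p1 = p3 / p2
    disc = p1 * p1 - 4.0 * p2
    if disc < 0:
        raise ValueError("gamma_1 and gamma_2 are not real for this root")
    root = math.sqrt(disc)
    # gamma_1 is taken as the root of smaller magnitude (the main component).
    g1, g2 = sorted(((p1 + root) / 2.0, (p1 - root) / 2.0), key=abs)
    var1 = mu2 - mu3 / (3.0 * g2) - p1 * g1 / 3.0 + p2
    var2 = mu2 - mu3 / (3.0 * g1) - p1 * g2 / 3.0 + p2
    n1 = -g2 / (g1 - g2) * total
    n2 = g1 / (g1 - g2) * total
    return {"n": (n1, n2), "variance": (var1, var2),
            "mean": (mean + g1 * unit, mean + g2 * unit)}

# Second root, with the moments printed for the 50 mm. grouping.
# Assumption: mean=1656.25 is inferred, not quoted.
result = dissect(-0.1002, 1.821042, -0.47784377, 11.94162208, -7.7823,
                 total=200, mean=1656.25, unit=50.0)
print(result)
```

A negative entry under `variance` corresponds to what the text calls an imaginary component. The variance of the minor component is very sensitive to the small root $\gamma_1$, so its value should be read as indicative only.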


Case 1.

     $p_2 = +0.1878$
     $p_3 = -5.282844$
Then $p_1 = p_3/p_2 = -28.130159$

Hence $\gamma_1$ and $\gamma_2$ are roots of

     $\gamma^2 + 28.1301\,\gamma + 0.1878 = 0$

We get
     $\gamma_1 = -0.00665$
     $\gamma_2 = -28.12345$

We obtain, finally, for the first component,

     $\sigma_1^2 = 1.940823$
     $\sigma_1 = 1.3931$
     $n_1 = \dfrac{28.12345}{28.11680} \times 200 = 200.0473 = 200$, to the nearest integer,
and  $m_1 = 1655.9175$ mm.

The second component is given by

     $\sigma_2^2 = -285.648943$
     $n_2 = -0.0473$
     $m_2 = 250.08$ mm.

The second component has $\sigma_2^2$ negative, and is thus imaginary.
Hence dissection into two real components is impossible in this case.
The first component, which is the only real component, gives
practically the whole of the given sample. The total frequency
of the second component is only -0.0473 and is quite negligible.


Case 2.

     $p_2 = -0.1002$
We find
     $p_3 = +1.21095836$
and
     $p_1 = -12.085413$
Thus
     $\gamma^2 + 12.085413\,\gamma - 0.1002 = 0$
and
     $\gamma_1 = +0.008285$
     $\gamma_2 = -12.101990$

We get for the first component,

     $n_1 = 199.8631$
     $\sigma_1^2 = 1.741046$
     $\sigma_1 = 1.319487$
     $m_1 = 1656.6642$ mm.

The second component is

     $n_2 = +0.1369$
     $\sigma_2^2 = -29.03962371$
     $m_2 = 1051.98$ mm.

We again find that the first curve gives practically the whole
of the given sample, while the second is imaginary.
Case 3.

     $p_2 = -0.0344$
Whence
     $p_3 = +0.171076$
     $p_1 = -4.973140$
Thus
     $\gamma^2 + 4.973140\,\gamma - 0.0344 = 0$
     $\gamma_1 = +0.0069075$
     $\gamma_2 = -4.9800475$

First component
     Mean = 1656.5954 mm.
     $n_1 = 199.723$
     $\sigma_1^2 = 1.766109$
     $\sigma_1 = 1.328950$

Second component
     Mean = 1407.2476 mm.
     $n_2 = +0.277$
     $\sigma_2^2 = +16.590329$


The second component is real this time, but its frequency
being only 0.277, it is again negligible. The first component gives
practically the whole of the distribution.

It will be seen that the first solution ($p_2 = +0.1878$) gives the frequency
curve as the difference of two normal curves. “The probability
curve, with positive area, may possibly be looked upon as
the birth population (unselectively diminished by death). The
negative probability curve is a selective diminution of units
about a certain mean; that mean may, perhaps, be the average
of the less fit.” 1 In our present case, however, the negative
component is imaginary. Hence we conclude that the real
component is describing the general population with sufficient
accuracy.

In the case of the second solution ($p_2 = -0.1002$) the second
component, though now additive, is still imaginary. The mean
is at 1051.98 mm. This component may be interpreted as representing
a “tendency” towards the presence of a small proportion
of dwarfs.

This tendency becomes more prominent in the third solution
($p_2 = -0.0344$). We find that the second component, which is additive
and real, definitely represents a “dwarf” distribution with
an average stature of 1407.24 mm. The proportion, however, is
extremely small. It is only 0.14% and can be safely neglected in
samples of 200. In larger samples of over a thousand, we should
not be surprised to get a few dwarfs.

So far as the present analysis goes we must conclude therefore
that it is not possible to break up our given curve into two real


1 Pearson, Phil. Trans. Roy. Soc., Vol. 185A, 1894, p. 76.
significant component distributions. The only sign of differentiation
perceived so far is a tendency towards the presence of a
very small proportion of dwarfs.

Symmetrical Dissection. 

We have already seen that $\beta_1$ (which measures the deviation
from symmetry) is not significantly different from zero in our
present case. In other words, within the limits of probable
errors it is quite possible to look upon our curve as a symmetrical
one. “Another important case of the dissection of a frequency
curve can arise, when the frequency curve, without being asymmetrical,
still consists of the sum or difference of two components,
i.e. when the means about which the component groups
are distributed are identical. This case is all the more interesting
and important, as it is not unlikely to occur in statistical
investigations, and the symmetry of the frequency-curve is then
in itself likely to lead the statistician to believe that he is dealing
with an example of the normal frequency-curve.” 1

Pearson also notes that “symmetry may arise in the case of
compound frequency curves, even without identity of the means
of the components. In this case, for two components, we should
have for different means, equality of component group totals and
their standard deviations. This equality seems less likely than
equality of means and divergence of totals and standard deviations.” 2

Pearson then shows that for this second type of symmetrical
dissection (i.e. divergent means) a necessary condition is that $3\mu_2^2$
should be greater than $\mu_4$, that is $\beta_2$ should be less than 3, or the
curve should be platy-kurtic. But we have seen that our curve
is lepto-kurtic (i.e. $3\mu_2^2$ is less than $\mu_4$), hence this type of dissection
is impossible in the present case.

I shall now discuss the possibility of the first type of symmetric
dissection. The fundamental equations are given in the
Memoir cited, p. 90. I shall slightly modify these equations in
order to express them in terms of the $\beta$-variables.

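Let N, $n_1$, $n_2$ represent the totals and $\Sigma$, $\sigma_1$ and $\sigma_2$ the
standard deviations of the compound and the two component
curves respectively. Then, as Pearson has shown, the solution is
given by

$$n_1 = \frac{\mu_2 - w_2}{w_1 - w_2}\,N, \qquad n_2 = \frac{w_1 - \mu_2}{w_1 - w_2}\,N,$$

$$\sigma_1^2 = w_1, \quad \sigma_2^2 = w_2, \quad\text{where } \mu_2 = \Sigma^2,$$

and $w_1$ and $w_2$ are the roots of

$$(\mu_4 - 3\mu_2^2)\,w^2 + (\mu_2\mu_4 - \tfrac{1}{5}\mu_6)\,w + (\tfrac{1}{5}\mu_2\mu_6 - \tfrac{1}{3}\mu_4^2) = 0.$$


1 Karl Pearson: “On the Dissection of Symmetrical Frequency Curves,”
Phil. Trans. Roy. Soc., Vol. 185A, 1894, p. 90.

2 Ibid., footnote on pp. 90-91.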
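This equation involves $\mu_6$. We can however transform this
equation to the $\beta$-variables.

Dividing throughout by $\mu_2^4$, we get

$$\left(\frac{\mu_4}{\mu_2^2} - 3\right)\left(\frac{w}{\mu_2}\right)^2 + \left(\frac{\mu_4}{\mu_2^2} - \frac{1}{5}\,\frac{\mu_6}{\mu_2^3}\right)\frac{w}{\mu_2} + \left(\frac{1}{5}\,\frac{\mu_6}{\mu_2^3} - \frac{1}{3}\,\frac{\mu_4^2}{\mu_2^4}\right) = 0.$$

But $\beta_2 = \mu_4/\mu_2^2$ and $\beta_4 = \mu_6/\mu_2^3$.

Changing to the $\beta$-variables and putting $x = w/\mu_2$ we get

$$(\beta_2 - 3)\,x^2 - \tfrac{1}{5}(\beta_4 - 5\beta_2)\,x + \tfrac{1}{15}(3\beta_4 - 5\beta_2^2) = 0.$$

Thus

$$\frac{w}{\mu_2} = \frac{\tfrac{1}{5}(\beta_4 - 5\beta_2) \pm \sqrt{\tfrac{1}{25}(\beta_4 - 5\beta_2)^2 + \tfrac{4}{15}(\beta_2 - 3)(5\beta_2^2 - 3\beta_4)}}{2(\beta_2 - 3)}.$$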

The condition for a real solution is that

$$\tfrac{1}{5}(\beta_4 - 5\beta_2) > \sqrt{\tfrac{1}{25}(\beta_4 - 5\beta_2)^2 + \tfrac{4}{15}(\beta_2 - 3)(5\beta_2^2 - 3\beta_4)}.$$

Squaring and subtracting,

$$0 > \tfrac{4}{15}(\beta_2 - 3)(5\beta_2^2 - 3\beta_4).$$

Pearson has shown that it is necessary that $w_1$ and $w_2$ should
be of the same sign.

The necessary condition for real solution becomes:

For lepto-kurtic curves, $\beta_2 - 3 > 0$ or $\beta_2 > 3$, it is necessary
that $3\beta_4$ should be greater than $5\beta_2^2$.

For platy-kurtic curves, $\beta_2 - 3 < 0$, i.e. $\beta_2 < 3$, the condition
is that $5\beta_2^2$ must be greater than $3\beta_4$.
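The criterion just stated can be put in a few lines of Python. This is a small illustrative sketch: the function names are hypothetical, and the two pairs of constants used in the example are the values of $\beta_2$ and $\beta_4$ obtained for the present sample, first by interpolation and then by direct calculation of the grouped moments.

```python
def symmetric_dissection_possible(beta2: float, beta4: float) -> bool:
    """Necessary condition: (beta2 - 3) and (3*beta4 - 5*beta2^2) have the same sign."""
    return (beta2 - 3.0) * (3.0 * beta4 - 5.0 * beta2 ** 2) > 0.0

def limiting_beta4(beta2: float) -> float:
    """Value of beta4 at which 3*beta4 equals 5*beta2^2."""
    return 5.0 * beta2 ** 2 / 3.0

# beta4 found by interpolation in the tables.
print(symmetric_dissection_possible(3.5046, 25.2360))
# beta4 found directly from the grouped moments.
print(symmetric_dissection_possible(3.601, 21.474))
# Limiting beta4 for beta2 = 3.5.
print(round(limiting_beta4(3.5), 2))
```

For a lepto-kurtic curve the dissection is therefore possible only when $\beta_4$ exceeds the limiting value returned by `limiting_beta4`.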

With ungrouped distribution it is almost impossible to find $\beta_4$
directly. We can however find $\beta_4$ in terms of $\beta_1$ and $\beta_2$ from
Table XLII (b), p. 78 of Tables for Biometricians and Statisticians. 1


We have
     $\beta_1 = 0.068756$
     $\beta_2 = 3.5046$

For $\beta_2 = 3.5$:
     $\beta_4 = 23.7289 + \dfrac{68756}{100000} \times [2.0142] = 25.1137$

For $\beta_2 = 4.0$:
     $\beta_4 = 31.00 + \dfrac{68756}{100000} \times [10.766] = 38.4023$

Hence for $\beta_2 = 3.5046$:
     $\beta_4 = 25.1137 + \dfrac{46}{5000} \times [13.2886] = 25.2360$


We have $\beta_2$ greater than 3, and $3\beta_4$ greater than $5\beta_2^2$, hence
we shall obtain a real solution.

The quadratic is

     $0.5046\,x^2 - 1.5426\,x + 0.953127 = 0$


1 K. Pearson: “Skew Correlation and Non-Linear Regression,” p. 8
(Drapers' Company Research Memoirs).





The solution is given by

     $w_1 = 2.1975$
     $w_2 = 0.6689$

Since $\mu_2 = 1.8162$, we get

     $\sigma_1 = 1.4824$   = 74.12 mm.
     $\sigma_2 = 0.817863$ = 40.89 mm.

And
     $n_1 = \dfrac{1.1473}{1.5286} \times 200 = 150.11$

     $n_2 = \dfrac{0.3813}{1.5286} \times 200 = 49.89$

It is thus possible to break up the curve into two normal
curves with the same Means but widely different Standard Deviations.
It will be observed that nearly three-fourths of the sample
has got a greater variability, while about one-fourth seems to be
a very stringently selected group. This particular solution may
be only a peculiarity of the sample and may have no reference to
actual fact so far as the general population is concerned. A
calculation of the probable error of $\beta_4$ may throw some light on
the question.

Pearson 1 gives the percentage variation of $\beta_4$ to be 23.3 in a
sample of 500. Multiplying this by

$$\sqrt{500/200} = \sqrt{2.5},$$

we get the percentage variation in a sample of 200 to be 36.84.
Hence the probable error in the present case is as large as ±9.28.


We thus have

     $\beta_4 = 25.236 \pm 9.28$


If we take our actual value of $\beta_2 = 3.5$, the necessary condition
for a real solution is that $\beta_4$ must be greater than 20.42. If
the value of $\beta_4$ for the general population is less than 20.42
(with a value of $\beta_2 = 3.5$) then the present method of dissection
will fail.

This limiting value is only 4.82 less than the value of $\beta_4$ in
the sample, while the probable error is ±9.28. It is therefore
not at all unlikely that $\beta_4$ should be less than 20.42 in the general
population. We conclude therefore that it is not unlikely that
the possibility of this particular type of dissection is only a peculiar
property of the sample and has no reference to actual fact in
the case of the general population.

Hence we are not justified, on this evidence alone, in concluding
that the sampled population is heterogeneous in character.
Note added on the 27th November, 1920. 

In view of the great importance of the question of hetero¬ 
geneity I thought it desirable to consider this question in greater 


1 K. Pearson: “Skew Correlation and Non-Linear Regression,” p. 8
(Drapers' Company Research Memoirs).
detail. I calculated the grouped moment-coefficients directly
up to $\mu_6$ with 50 mm. as the unit of grouping. I find

     $\mu_2 = 1.821042$
     $\mu_3 = -0.47784377$
     $\mu_4 = 11.94162208$
     $\mu_5 = -7.7823$
     $\mu_6 = 129.743842$

Thus
     $\beta_2 = 3.601$
and
     $\beta_4 = 21.474$


Since $\beta_2$ is greater than 3, it is necessary that $3\beta_4$ should be
greater than $5\beta_2^2$. Actually we find

     $3\beta_4 = 64.4220$

while $5\beta_2^2 = 64.8360$, so that $5\beta_2^2$ is > $3\beta_4$.


Thus no real solution is possible in this case. But we must 
note that there is some tendency towards a solution of this type. 
I do not propose to draw any inference from this result. I have 
not yet analysed the other frequency curves and so I am not in 
a position to either confirm or refute this tendency towards a very 
special type of splitting up. 1 


Goodness of fit with Sum of Dissected Components.

First component:
     Mean stature = 1656.79 mm.
     S.D. = $\sigma_1$ = 74.12 mm.
     Total = $n_1$ = 150

Second component:
     Mean stature = 1656.79 mm.
     S.D. = $\sigma_2$ = 40.8932 mm.
     Total = $n_2$ = 50


     I           II           I+II (Total)     Observed    m ~ m'     (m - m')²/m'
     First       Second       Theoretical         m
     Component   Component        m'

      6.5348      0.0471         6.5818            8        1.4182       0.3055
     15.9753      1.4622        17.4375           14        3.4375       0.6777
     31.3152     11.2966        42.6118           45        2.3882       0.1307
     39.4329     22.9316        62.3645           60        2.3645       0.0896
     32.2382     12.4261        44.6643           48        3.3357       0.2491
     17.2674      1.7715        19.0389           20        0.9611       0.0485
      7.2361      0.0647         7.3008            5        2.3008       0.7246

     n' = 7                                                        $\chi^2$ = 2.2257


1 Since going to press, I have obtained expressions for the Probable Errors
of the Component Frequency Constants, which confirm the non-significant
character of the dissection in the present case. I hope to publish these new
formulae for Probable Errors at an early date.
Thus, P = 0.894680
With the single normal curve, we had P = 0.826583

Difference = 0.068097

Thus there is an improvement of 8.2% in the fit. This is
satisfactory. But, in view of the discussion of probable errors,
perhaps this is not sufficient to warrant us in asserting that the
possibility of the present type of dissection is unmistakeable evidence
of heterogeneity of the material.
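The test of goodness of fit can be recomputed with a short sketch. It is illustrative only: the theoretical and observed frequencies are those of the table for the sum of the dissected components, the function names are hypothetical, and P is taken from the closed form of the $\chi^2$ integral for an even number of degrees of freedom (here assumed to be n' - 1 = 6) instead of by interpolation in printed tables, so the last figures need not agree with the tabular values.

```python
import math

theoretical = [6.5818, 17.4375, 42.6118, 62.3645, 44.6643, 19.0389, 7.3008]
observed = [8, 14, 45, 60, 48, 20, 5]

def chi_square(obs, exp):
    """Sum of (m - m')^2 / m' over the groups."""
    return sum((m - e) ** 2 / e for m, e in zip(obs, exp))

def p_value_even_dof(chi2, dof):
    """Upper-tail probability of chi-square for an even number of degrees of freedom."""
    half = chi2 / 2.0
    term, total = 1.0, 1.0
    for k in range(1, dof // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total

chi2 = chi_square(observed, theoretical)
p = p_value_even_dof(chi2, dof=len(observed) - 1)
print(f"chi-square = {chi2:.4f}, P = {p:.4f}")
```

A value of P near unity means that deviations as large as those observed would arise in most random samples from the fitted curve.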



SECTION VI. DATA FOR COMPARISON. 
Source of the Material. 


I have collected material from many different sources. In
1897, K. Pearson 1 gave the coeff. of variation for 1000 English
middle-class men, 390 Bavarian men, 284 French (from statistics
given in “Memoires de la Societe d’Anthropologie de Paris,” 1888)
and also some data for American school children (from the years
6 to 10, taken from Porter’s “Growth of Saint Louis Children”).
I have retained his French and German data but have substituted
corrected values for Englishmen given by Pearson in a later
paper. I have omitted the children as being all under the age
of 10.

Pearson also reduced statistics for U.S.A. recruits 2 and gave
final figures for his family data 3 in Biometrika in 1903. His
family data consists of 1078 records of middle-class English fathers
and sons.

Powys 4 gave the heights of 2862 male criminals from New
South Wales, distributed into different age-groups. I have selected
the total variability 5 of the whole group, for in our Anglo-Indian
data men of all ages are present. Powys considers his
data to be “extremely homogeneous.” 6

In 1901, W. R. Macdonell 7 discussed the measurements for
3000 English criminals. He also calculated the coeff. of variation
for 1000 Cambridge undergraduates. 8

Raymond Pearl 9 has calculated variabilities of stature for
416 Swedes, 475 Hessians, 266 Bohemians, and 365 Bavarians. 10
The measurements were all taken on dead bodies and the coeffs. of
variation are 4.009 ± .094, 3.954 ± .117, 4.323 ± .127 and 3.838 ± .096
respectively.

Blakeman 11 has analysed a short series of 117 English males 
who died in hospitals. The coeff. of variation 12 for stature is 


1 K. Pearson: “Chances of Death,” Vol. 1, pp. 294-296.

2 Phil. Trans. Roy. Soc., Vol. 184A, p. 386.

3 Biometrika Vol. 2 (1903), p. 370; K. Pearson and Alice Lee: “On the
Laws of Inheritance in Man,” pp. 357-482.

4 A. O. Powys: “Anthropometric Data from Australia,” Biometrika Vol. 1
(1901), pp. 30-49.

5 Ibid., p. 44. 6 Ibid., p. 38.

7 W. R. Macdonell: “On Criminal Anthropometry and the Identification
of Criminals,” Biometrika Vol. 1 (1901), pp. 177-277.

8 Ibid., p. 189.

9 Raymond Pearl: “Variation and Correlation for Brain Weight,”
Biometrika Vol. 4 (1905), pp. 13-104.

10 Ibid., p. 23.

11 J. Blakeman: “A Study of the Biometric Constants of English Brain-Weights,
and their Relationships to External Physical Measurements,” Biometrika
Vol. 4 (1905), pp. 124-160.

12 Ibid., p. 126.



[Vol. XXIII, 1922.] P. C. Mahalanobis : Analysis of Stature. 61 


4.55 ± .20. Blakeman believes 1 the “increased variability in
stature to be due to the measurements being taken on the corpse
and not on the living subject.” He mentions further 2 that the
average V for males in Pearl’s data is 4.11.

I have thought it best to omit the above series of corpse
data for purposes of comparison. It will be observed that the
variability is in each case considerably higher than the average
variability (which is about 3.6) obtained by omitting them. Thus
the only effect of including the “corpse” data would be to still
further increase our average variability. We may further note that
in most of the above cases, the variability is even higher than the
variability of our Anglo-Indian data, which is about 4.06. Thus
omission of the corpse data cannot affect our general conclusion
that the variability of the Anglo-Indian series is not significantly
greater than the average variability of stature for homogeneous
material.

Tocher 3 gave in 1906, a very large series of measurements 
on the Scottish Insane, numbering 4381 males. 

Schuster 4 in 1910 gave V for different age-groups of Oxford 
undergraduates. For reasons already explained I have taken 
the average variability for the whole group of 959 individuals. 
In an editorial note to the above, 5 some results for 493 Scottish 
(Aberdeen) undergraduates are quoted. I have calculated the 
coeff. of variability in this latter case also. I may note in passing 
that the different age-groups of the Oxford data do not give lower 
values of variability, in fact give slightly greater values than the 
total in many cases. 6 

Craig 7 gave the results of a very large series of measurements
of modern Egyptians. These were classified in accordance with 
the town or district of birth. 8 The total number in each group 
is fairly large and this series gives us a very good list of variabi¬ 
lities for purposes of comparison. I have retained the separate 
variability for Aswan, omitting the total variability as the material 
is not homogeneous. 

Garett 9 has given a series of measurements of the natives of
Borneo and Java. The majority were coolies in the employ of the
author. Unfortunately the number in the case of each people is
not very extensive, and I have been only able to retain the values


1 Ibid., p. 131. 2 Ibid., p. 132.

3 J. F. Tocher: “The Anthropometric Characteristics of the Inmates of
Asylums in Scotland,” Biometrika Vol. 5 (1906), pp. 298-350.

4 E. Schuster: “First Results from the Oxford Anthropometric Laboratory,”
Biometrika Vol. 8 (1911), pp. 40-51.

5 Ibid., p. 49.

6 Thus the lumping together of all age-groups cannot again affect the general
validity of our conclusions.

7 J. I. Craig: “Anthropometry of Modern Egyptians,” Biometrika Vol. 8
(1911), pp. 69-77.

8 Ibid., p. 75.

9 T. R. H. Garett: “Natives of the Eastern Portion of Borneo and Java,”
Jour. Roy. Anthrop. Inst., Vol. XLII, 1912, pp. 60-66.
for Javanese (17), Banjerese (33) and Sundanese (37), as no other
series includes more than 7 individuals.

Joyce 1 has given figures for 25 different groups of people of 
Chinese Turkestan and the Pamirs. But again the total number 
is rather small in most cases, even the longest series including only 
67 individuals. 

Leys and Joyce 2 gave measurements for 38 different groups
of people from East Africa. Some of these are foreigners. Numbers
are moderately large in some cases, the longest series containing
384 individuals.

Seligmann 3 has given measurements for 7 groups of people
of Anglo-Egyptian Sudan. The number in each group is moderately
large, being on an average about 50. Dr. Bowley has
analysed the Dinka group containing 116 individuals. The absolute
S.D. (9.66 mm.) as well as the coeff. of variation (5.4311) is
exceptionally high. Dr. Bowley 4 concludes from the goodness of fit
that “there is no indication of the mixture of two distinct groups
with widely differing averages.” 5

Frankly speaking, such a high value of V as 5.4311 ± .24 for
homogeneous material is extremely puzzling. We have of course
obtained several high values of V, but in all such cases the numbers
are quite small and the P.E. quite large. One would like to
obtain independent evidence regarding the homogeneity of the
Dinka people. In any case, a fresh series of measurements of the
Dinka people is urgently needed.

Goring 6 has given extensive data for English criminals, to
which we shall have to refer again.

Whiting 7 has discussed the case of 500 English convicts belonging
to Dr. Goring’s data.

Orensteen 8 gave results for 802 adult male Egyptians born in
Cairo.

Addendum. 

Dudley Buxton has recently published the Variabilities of 30 
Mediterranean and 3 Jewish races. 9 


1 T. A. Joyce: “Notes on the Physical Anthropology of Chinese Turkestan
and the Pamir,” Jour. Roy. Anthrop. Inst., Vol. XLII, 1912, p. 450.

2 Norman M. Leys and T. A. Joyce: “Note on a series of Physical Measurements
from East Africa,” Jour. Roy. Anthrop. Inst., Vol. XLIII, 1913,
p. 195.

3 C. G. Seligmann: “Some Aspects of Hamitic Problem in the Anglo-Egyptian
Sudan,” Jour. Roy. Anthrop. Inst., Vol. XLIII, 1913, pp. 592-705.

4 Ibid., p. 705.

5 In the absence of any attempt at statistical dissection, mere homotyposis in
graduation cannot be considered conclusive evidence of homogeneity.

6 Charles Goring: “The English Convict,” 1913.

7 Madeline H. Whiting: “On the Association of Temperature, Pulse and
Respiration with Physique and Intelligence in Criminals,” Biometrika Vol. 11
(1915), pp. 1-37.

8 Myers M. Orensteen: “Measurements of Cairo-born Egyptians,” Biometrika
Vol. 11 (1915), pp. 67-81.

9 Biometrika, Vol. XII, 1920, pp. 92-112.

I may note that in many cases the Coeff. of Variation has been
calculated by me.

Risley 1 published the crude measurements of 87 Indian castes
and tribes, but he did not calculate a single frequency constant 
or a single probable error. The size of sample varies from 185 to 
2, yet every average has been given equal weight on the strength 
of his authority. The averages published in his book were in many 
cases hopelessly wrong, in one instance the difference amounted to 
no less than 60 mm. 

I have just finished calculating the frequency constants for the
whole of Risley’s data for Stature. I hope to publish my results
at an early date. Meanwhile I shall use my summary table for
purposes of comparison in this paper.

It should be noted that the present section was already sub¬ 
mitted to the press when the Mediterranean data reached me. 
The Risley data also had not then been reduced. Thus the 
earlier part of the present section does not include the above two 
series of data. I have retained a portion of the older work, but 
have gone over the whole ground again with the inclusion of the 
new data. 

The Caste data of Risley is substantially differentiated from
other samples in showing a significantly lower Variability, hence
the Anglo-Indian sample is found to be significantly more
variable than the Indian Castes and Tribes. Otherwise the
inclusion of the new data does not upset the earlier conclusion
that the Anglo-Indian Variability, though higher than the general
Variability of “homogeneous” races, is not significantly different.
As a matter of fact Anglo-Indian Variability is just about the
same as the Variability of European (in a geographical sense only)
races.

Note on the Retention of Criminal Data. 

It may be objected that a criminal population being substantially
differentiated from the general population, it is not legitimate
to use criminal data for comparative purposes. We can only reply
that if there is any fundamental anthropological differentiation
this has not yet been proved to be the case. On the other hand
the bulk of available statistical evidence goes to show that there is
no such thing as a different criminal type. Craig 2 says of his
Egyptian data, “it may be objected that criminality in itself is a
determining factor of selection, but the objection does not hold in
Egypt” and he proceeds to explain why. In the case of New
South Wales also the same is true. There is no significant differentiation
of criminals from the general population. 3

As regards the English convict, we need only refer to the
great work on the subject by Dr. Charles Goring (already cited
several times in this paper). Goring comes to the conclusion that
the Lombrosian doctrine of criminal types is false. “Criminals as


1 “Indian Castes and Tribes,” 2 Vols. (1904?) (Superintendent of Government
Printing, Calcutta).

2 J. I. Craig: loc. cit. Biom. Vol. 8 (1911).
3 Goring: “The English Convict,” (1913), p. 198.
criminals are not a physically differentiated class of the general
community. The physical and mental constitution of both criminals
and law abiding persons of same age, stature, class and intelligence
are identical. There is no such thing as an anthropological
criminal type. 1 ” In view of Goring’s work we may safely include
criminal data for purposes of comparison, at least until statistical
evidence in support of the Lombrosian doctrine is forthcoming.

1 Goring: Ibid., p. 370.

Table 5.

Mean Stature, S.D. and Coeff. of Variation of 100 different races.

Note. (1) The number immediately after the name of the race gives the
reference of the source from which material is collected (see end of table).

(2) Second column gives number of individuals on which the average is based.

(3) Races italicised were selected as more reliable. It will be noticed that the
total number in each case is greater than 25, and the P.E. of Coeff. of Variation
is less than .32 or [illegible].


No.  Name of Race (source)    No. in   Mean (mm.)         S.D. (mm.)      100 × (Coeff. of Var.
                              Sample   ± P.E. of Mean     ± P.E. of S.D.  ± P.E. of V.)

  1  Segua (1)                    12   1670 ± 5.7         29.46 ± 4.0     176.42 ± 24.28
  2  Digo (1)                     15   1629.4 ± 5.9       33.78 ± 4.2     207.32 ± 25.52
  3  Nyika (1)                    18   1658.1 ± 6.3       39.37 ± 4.4     237.43 ± 26.68
  4  Comoro (1)                   23   1662.9 ± 5.9       41.66 ± 4.1     250.49 ± 24.91
  5  Kaseri (1)                   12   1696.5 ± 8.6       43.94 ± 6.0     259.02 ± 35.66
  6  Javanese (2)                 17   1570.59 ± 6.71     43.3 ± 6.4      261 ± 34
  7  Kelpin (3)                   15   1650.0 ± 9.8       44.6 ± 7.0      270.30 ± 33.28
  8  Sarikoli (3)                 40   1637.7 ± 6.0       44.3 ± 4.3      270.50 ± 20.39
  9  Nandi (1)                    14   1676.4 ± 8.3       45.9 ± 5.9      274.24 ± 34.95
 10  Lamu (1)                     26   1637.0 ± 5.9       44.96 ± 4.2     274.63 ± 25.68
 11  Dolan (3)                    16   1641.1 ± 9.5       46.10 ± 6.7     280.89 ± 33.49
 12  Muscat Arab (1)              31   1648.4 ± 5.8       47.8 ± 4.1      289.67 ± 24.81
 13  Faizabad (1)                 12   1669.2 ± 11.0      49.2 ± 7.8      294.75 ± 40.58
 14  Shilluk (5)                  14   1776.0 ± 9.6       53.0 ± 6.8      298.42 ± 38.04
 15  Baganda (1)                  44   1664.7 ± 5.1       50.3 ± 3.6      302.10 ± 21.72
 16  Hami (3)                     21   1630 ± 8.3         49.5 ± 5.9      303.68 ± 31.60
 17  Yemen Arab (1)               20   1647.7 ± 7.6       50.29 ± 5.4     305.22 ± 32.55
 18  Swahili (1)                  53   1646.7 ± 4.7       50.3 ± 3.3      305.41 ± 20.01
 19  Wanyamwezi (1)              101   1764.9 ± 3.5       51.6 ± 2.4      307.85 ± 14.61
 20  Nissa (3)                     9   1602.2 ± 12.7      49.5 ± 9.0      308.95 ± 49.11
 21  Pakhpo (3)                   25   1604.0 ± 7.6       49.5 ± 5.4      308.60 ± 29.45
 22  Segeju (1)                   36   1631.1 ± 5.7       50.5 ± 4.0      309.82 ± 24.62
 23  Chinese (3)                  20   1667.0 ± 8.5       51.7 ± 6.0      310.97 ± 33.08
 24  Banjerese (2)                33   1569.64 ± 5.71     48.61 ± 4.04    310 ± 26
 25  Niya (3)                     18   1626.0 ± 9.0       50.4 ± 6.4      310.15 ± 34.86
 26  Karnaghu-Tagh (3)            21   1660.5 ± 8.3       52.9 ± 5.9      318.57 ± 33.15
 27  Canal Egyptians (4)         127   1658.7 ± 3.2       54.2 ± 2.3      326.00 ± 14.00
 28  Kababish (5)                 23   1709.0 ± 7.9       56.0 ± 5.6      327.67 ± 32.58
 29  Cutch (1)                    24   1633.0 ± 7.4       54.1 ± 5.3      331.31 ± 32.25
 30  Nejmps (1)                   11   1723.1 ± 11.7      57.4 ± 8.3      333.13 ± 47.96
 31  Khotan (3)                   67   1655.2 ± 4.6       55.5 ± 3.2      335.30 ± 19.53
 32  Punjabi (1)                  60   1683.8 ± 5.0       57.2 ± 3.5      339.41 ± 20.89
 33  Bantu Kavirondo (1)          24   1692.6 ± 7.9       57.4 ± 5.6      339.13 ± 33.01
 34  Minia (4)                   491   1669.70 ± 1.7      56.6 ± 1.2      339.00 ± 7.66
 35  Sundanese (2)                37   1591.30 ± 6.00     54.07 ± 4.24    340 ± 27
 36  Kamba (1)                   128   1656.6 ± 3.4       56.6 ± 2.4      341.92 ± 14.41
 37  Turfan (3)                   72   1662.6 ± 4.5       57.0 ± 3.2      342.83 ± 19.27
 38  Beheira (4)                 525   1676.8 ± 1.7       57.4 ± 1.2      342.00 ± 7.00
 39  Biloch (1)                   15   1649.7 ± 9.9       56.6 ± 7.0      343.34 ± 42.27
 40  Duruma (1)                   67   1649.2 ± 4.8       57.7 ± 3.4      349.60 ± 20.37
 41  Arab and Swahili (1)         32   1644.6 ± 6.9       57.7 ± 4.9      350.57 ± 29.55
 42  Giza (4)                    326   1678.0 ± 2.2       58.8 ± 1.6      350 ± 9
 43  Chitrali (3)                 22   1684.5 ± 8.1       59.3 ± 5.8      352.03 ± 35.79
 44  Qena (4)                    824   1678.0 ± 1.4       59.0 ± 1.0      352 ± 6
 45  Beni Amer (5)                51   1643 ± 5           58 ± 4          353.01 ± 23.57
 46  Girga (4)                   610   1677.7 ± 1.6       59.2 ± 1.1      353 ± 7
 47  Fayum (4)                   413   1672.0 ± 2.0       59.2 ± 1.4      354 ± 8
 48  Polu (3)                     31   1644.2 ± 7.0       58.3 ± 4.9      354.57 ± 30.37
 49  Beni Suef (4)               384   1662.3 ± 2.0       59.1 ± 1.4      355 ± 9
 50  Gharbia (4)                1105   1673.3 ± 1.2       59.4 ± 0.9      355.00 ± 5
 51  Masai (1)                    91   1700.0 ± 4.3       60.7 ± 3.0      357.08 ± 17.85
 52  Hadendoa and Amara (5)       54   1676 ± 5           60 ± 4          357.99 ± 23.23
 53  Aksu (3)                     13   1637.7 ± 10.6      58.5 ± 7.5      357.20 ± 47.25
 54  Sheher (1)                   82   1615.7 ± 4.4       57.9 ± 3.1      358.43 ± 18.87
 55  Alexandria (4)              643   1666.2 ± 1.6       59.7 ± 1.1      359.0 ± 7
 56  Kokyar (3)                   37   1629.2 ± 6.3       58.9 ± 4.4      361.52 ± 28.34
 57  Giriama (1)                  24   1629.7 ± 8.1       58.9 ± 5.7      361.59 ± 35.20
 58  Daqahlia (4)                504   1660.6 ± 1.8       60.0 ± 1.3      361 ± 8
 59  Assiut (4)                  889   1668.9 ± 1.4       60.3 ± 1.0      362 ± 6
 60  Cairo (14)                  802   1682.9 ± 1.4       59.3 ± 1.0      364 ± 6
 61  Wakhi (3)                    19   1680 ± 8.8         61.8 ± 6.2      367.84 ± 40.25
 62  Camb. Students (11)        1000   1748.88 ± 1.4      64.6 ± 0.97     369.58 ± 5.58
 63  Ajawa (1)                    16   1652.2 ± 10.3      61.2 ± 7.3      370.48 ± 64.17
 64  Aswan North (4)             115   1683.3 ± 3.9       62.3 ± 2.8      370.00 ± 16.00
 65  Menufia (4)                 718   1677.0 ± 1.6       62.5 ± 1.1      371 ± 7
 66  Embu (1)                    110   1630.1 ± 3.9       61.2 ± 2.9      375.50 ± 17.07
 67  Kafir (3)                    18   1667.8 ± 9.0       63.3 ± 6.4      379.54 ± 42.66
 68  Manyema (1)                  42   1667.5 ± 6.6       63.2 ± 4.7      379.28 ± 27.91
 69  Kikuyu (1)                  384   1640 ± 2.2         62.5 ± 1.5      380.98 ± 9.27
 70  Qualiubia (4)               295   1662.4 ± 2.5       63.1 ± 1.8      380 ± 10
 71  Sharqia (4)                 516   1655.4 ± 1.9       63.7 ± 1.3      382 ± 8
 72  U.S.A. Recruits (6)      25,898   1709.4 ± 0.27      65.6 ± 0.19     383.76 ± 1.15
 73  Nuer (5)                     39   1806 ± 8.0         70 ± 5          387.59 ± 29.60
 74  N.S.W. Criminals (8)       2871   1698.8 ± 0.83      65.8 ± 0.58     387.33 ± 3.45
 75  Nyasa (1)                    21   1640.0 ± 9.4       63.7 ± 7.3      390.27 ± 40.61
 76  Keriya (3)                   21   1612.5 ± 9.3       62.9 ± 6.5      390.07 ± 40.59
 77  Sukuma (1)                   21   1717.0 ± 9.9       67.3 ± 7.0      392.01 ± 42.80
 78  Kirghiz (3)                  38   1640.8 ± 6.2       64.6 ± 4.4      393.71 ± 30.46
 79  Somali (1)                   27   1735.1 ± 7.6       68.6 ± 5.4      395.25 ± 36.27
 80  Suk (1)                      15   1677.9 ± 11.6      66.3 ± 8.2      395.2 ± 48.66
 81  Eng. Sons (9)              1078   1744.0 ± 1.42      69.4 ± 1.0      395 ± 6
 82  Eng. Fathers (9)           1078   1719.5 ± 1.39      68.7 ± 1.0      399 ± 6
 83  Germans (10)                390   1659.3 ± 2.3       66.8 ± 1.6      402.37 ± 10.38
 84  Eng. Criminals (11)        3000   1658.1 ± 1.6       68.07 ± 1.2     411 ± 9
 85  Nilotik Kavirondo (1)        37   1729.0 ± 7.9       71.4 ± 5.6      412.81 ± 32.36
 86  Loplik (3)                   38   1695.0 ± 6.2       70.3 ± 4.4      414.74 ± 32.08
 87  Barabra (5)                  70   1680 ± 7.0         70 ± 3.7        416.66 ± 23.75
 88  Kachamega (1)               100   1668.3 ± 4.7       69.8 ± 3.3      418.69 ± 19.96
 89  Kamasia (1)                  20   1719.8 ± 10.9      72.4 ± 7.7      420.91 ± 44.89
 90  Aswan South (4)              95   1650.6 ± 4.8       69.4 ± 3.4      421 ± 21
 91  Mastuji (3)                  28   1666.1 ± 7.2       70.4 ± 5.1      422.54 ± 38.08
 92  Korla (3)                    14   1667.9 ± 10.2      70.6 ± 7.2      423.28 ± 53.95
 93  Scot. Insane (7)           4381   1673.8 ± 0.73      72.1 ± 0.52     430.95 ± 3.10
 94  Scot. Total (7)            4401   1668.8 ± 0.75      73.7 ± 0.53     441.40 ± 3.17
 95  Bagh-jigda (3)               12   1647.5 ± 11.0      73.2 ± 7.8      446.30 ± 61.17
 96  Charklik (3)                 12   1678.3 ± 11.0      74.6 ± 7.8      446.28 ± 61.44
 97  Chaga (1)                    18   1641.6 ± 12.2      76.96 ± 8.7     468.82 ± 52.70
 98  Rabai (1)                    13   1626.1 ± 14.5      77.4 ± 10.2     476.41 ± 63.01
 99  Turkana (1)                   9   1694.4 ± [illegible]  86.1 ± 13.7  508.16 ± 80.78
100  Dinka (5)                   116   1786 ± 6.0         97.0 ± 4.4      543.11 ± 24.04


Supplementary List.

In this List actual Coefficients of Variability are given.


No.  Name of Race (source)                  No. in        Mean (mm.)       S.D. (mm.)       Coeff. of Var.
                                            Sample        ± P.E. of Mean   ± P.E. of S.D.   ± P.E. of V.

101  Crete, whole Island (12)                  318        1706.1 ± 2.6     67.5 ± 1.8       3.96 ± .12
102  Eparchies (Selinos, Sphakia) (12)          50        1752.6 ± 5.4     57.1 ± 3.9       3.26 ± .22
103  Albanian (12)                             140        1693.2 ± 3.7     65.7 ± 2.6       3.88 ± .15
104  Cyprus (whole Island) (12)                585        1687.7 ± 1.7     61.6 ± 1.2       3.64 ± .07
105  Cyprus (Nicosia) (12)              [illegible]       1678.8 ± 3.9     60.5 ± 2.7       3.60 ± .16
106  Cyprus (Lapitho) (12)                     221        1680.0 ± 2.5     54.7 ± 1.8       3.25 ± .10
107  Cyprus (Ekomi) (12)                       167        1690.5 ± 3.2     60.8 ± 2.2       3.59 ± .13
108  Cyprus (Levkonika) (12)                    87        1689.8 ± 4.6     63.7 ± 3.3       3.77 ± .19
109  Cyprus (Leukas) (12)                       42        1668.0 ± 6.7     64.3 ± 4.7       3.86 ± .33
110  Lycian Gypsies (12)                        53        1660.2 ± 4.4     47.8 ± 3.1       2.88 ± .20
111  Persian Jews (12)                          57        1643.5 ± 5.2     [illegible]      3.53 ± .22
112  Yemen Jews (12)                            78        1594.0 ± 2.9     [illegible]      3.76 ± .20
113  Samarkand Jews (12)                       100        1664.2 ± 3.9     [illegible]      3.52 ± .17
114  Oxford students (13)                      959        1765             66.084           3.7439
115  Aberdeen students (11)                    493        1717.0 ± 1.8     59.4 ± 1.3       3.4595


(1) Leys and Joyce, Jour. Roy. Anthrop. Inst., Vol. XLIII (1913), p. 216.

(2) Garett, Jour. Roy. Anthrop. Inst., Vol. XLII (1912), pp. 60-66.

(3) Joyce, Jour. Roy. Anthrop. Inst., Vol. XLII (1912), p. 473.

(4) Craig, Biometrika, Vol. 8 (1911), p. 75.

(5) Seligmann, Jour. Roy. Anthrop. Inst., Vol. XLIII (1913), pp. 700-702.

(6) Pearson, Phil. Trans. Roy. Soc., Vol. 184A, p. 386.

(7) Tocher, Biometrika, Vol. 5 (1906-7), p. 307.

(8) Powys, Biometrika, Vol. 1 (1901), p. 44.

(9) Pearson, Biometrika, Vol. 2 (1903), p. 370.

(10) Pearson, Chances of Death, Vol. I, pp. 294-296.

(11) Macdonell, Biometrika, Vol. I (1901), p. 191.

(12) Buxton, Biometrika, Vol. 13 (1920), p. 104 and p. 108.

(13) Schuster, Biometrika, Vol. 8 (1911), p. 49.

(14) Orensteen, Biometrika, Vol. 11 (1915), pp. 67-81.
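
The entries of Table 5 can be spot-checked from the sample size, mean and S.D. alone. The sketch below assumes the usual probable-error formulae of the biometric school (probable error = .67449 × standard error, with the standard errors appropriate to a normal sample); the section does not state which formulae were used, so this is an assumption, and the function name `row_constants` is purely illustrative.

```python
import math

PE_FACTOR = 0.67449  # probable error = 0.67449 x standard error


def row_constants(n, mean, sd):
    """Return (P.E. of mean, P.E. of S.D., 100*V, P.E. of 100*V) for one sample.

    V is the coefficient of variation in per cent, V = 100 * sd / mean.
    The formulae assume sampling from a normal population (an assumption here).
    """
    v = 100.0 * sd / mean
    pe_mean = PE_FACTOR * sd / math.sqrt(n)
    pe_sd = PE_FACTOR * sd / math.sqrt(2 * n)
    pe_v = PE_FACTOR * v / math.sqrt(2 * n) * math.sqrt(1 + 2 * (v / 100.0) ** 2)
    return pe_mean, pe_sd, 100.0 * v, 100.0 * pe_v


# First row of Table 5: Segua, 12 individuals, mean 1670 mm., S.D. 29.46 mm.
pe_mean, pe_sd, v100, pe_v100 = row_constants(n=12, mean=1670.0, sd=29.46)
print(f"P.E. of mean  = {pe_mean:.2f} mm.")
print(f"P.E. of S.D.  = {pe_sd:.2f} mm.")
print(f"100 x V       = {v100:.2f} +/- {pe_v100:.2f}")
```

For this row the sketch agrees with the printed figures to within rounding. Not every row of the table agrees so closely, which may reflect either the original computation or damage in the printed copy.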


Table of Variabilities. 

There are several remarkable points about the Table of Variabilities.
The material is supposed to be homogeneous in each
case, yet we note the extreme range of variation of the coeff. of
variability. We have 1.7642 ± .2428 and 5.0816 ± .8078 as our
extreme values.

The mean variability is very near 3.6, and one very remarkable
fact is this:

I. The more highly civilised races have greater variabilities
than the average.

This confirms Pearson’s result for Cephalic Index. 1 Pearson
concludes for Cephalic Index that greater variability is a characteristic
of the “races which have been successful in the struggle for
existence, and at the present time are the dominant races of the


1 Chances of Death, Vol. 1, p. 292.
earth. At the same time the greater variability of the more dominant
and civilised peoples admit of being interpreted as a result of
the lesser severity of the struggle for existence among them. Thus
greater variability would be an effect not a cause of the higher
state of civilisation.”

Another fact which may be gathered from the above table is 
this. The more civilised races though more variable, do not in 
any case occupy the extreme ends of the table. Thus one would 
probably be justified in inferring that a higher state of civilisation 
is not associated with extreme degrees of variability. 

We may look at the same question from a different point of
view. The less civilised races occupy the extreme ends of the table
more frequently than the more civilised races. The less civilised
races, though on the whole less variable, may thus be associated
with extreme degrees of variability.

II. The greater variability of more highly civilised races seems to
be only moderate in degree and is never excessive. 1

It seems as if slightly greater variability than the stable type 
of the species is accompanied by greater adaptability and hence 
with a higher state of progress. 

Interracial Variability. 

There is another point which deserves attention. By looking
at our general list of variabilities, we find some association
between average stature (M) and standard deviation (σ).

The point which we are considering now is interracial correlation
between M and σ for the different races. 2

If $\mu_{11} = S(xy)/N$,

then the correlation coefficient as determined by the product moment
method, 3 is given by

$$r = \mu_{11} / (\sigma_x \sigma_y)$$

where $\sigma_x$ and $\sigma_y$ are the S.D. of the two variables.

I find, without grouping, with base numbers 1660 mm. and
60 mm. respectively for average stature and S.D., the raw moments
to be:

For Stature $\nu_1' = 5.24$, $\nu_2' = 1389.48$


1 In the selected list (see below) this fact is not so apparent. It seems as if
the extremely high variability of less civilised races is due to unreliability of
data.

2 This is quite distinct from the intra-racial (or within the race) correlation
between errors in Mean and errors in S.D.

In Biom. Vol. 2 (1903), Problem IX, p. 279, it is shown that

$$R_{M,\sigma} = \mu_3 / (2 N \sigma_M \sigma_\sigma \sqrt{\mu_2})$$

In our case, $\mu_3$ is negative, hence a taller subsample of Anglo-Indians will
show less variability and vice versa. This is actually the case with the two
subsamples we have already considered. The subsample with a higher average
1658.75 mm. has a S.D. of 68.85 mm. as against the other with Mean = [illegible]
mm. and S.D. = 73.26. $\mu_3$ being small, the correlation, however, is very small.

3 See Yule: “Theory of Statistics,” p. 171.
Standard Deviation (base No. 60)

$\nu_1' = -.77$, $\nu_2' = 92.29$

and for the product moment, $\nu_{11}' = 110.04$.


Transferring to Mean, we get:

For Stature:

Mean value of Average stature = 1665.24 mm.
Standard Deviation = 36.9052 mm.

Coeff. of interracial Variability = 2.2162


For Standard Deviation:

Mean value of Standard Deviation = 59.23 mm.

S.D. of Standard Deviation = 9.5758 mm.

Coeff. of interracial V of S.D. = 16.17
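
The step of "transferring to Mean" can be illustrated in a few lines of Python. The sketch assumes the ordinary relations between moments taken about a base number and moments about the mean (mean = base + first raw moment; variance = second raw moment less the square of the first; central product moment = raw product moment less the product of the two first moments). The function name is illustrative, and the first raw moment for the S.D. is taken as negative, as the printed mean of 59.23 requires.

```python
import math


def transfer_to_mean(base, nu1, nu2):
    """Convert raw moments about a base number into mean, S.D. and 100*S.D./mean."""
    mean = base + nu1
    sd = math.sqrt(nu2 - nu1 ** 2)
    return mean, sd, 100.0 * sd / mean


# Average stature of the races: base 1660 mm.
mean_m, sd_m, cv_m = transfer_to_mean(1660.0, 5.24, 1389.48)
# Standard deviations of the races: base 60 mm.
mean_s, sd_s, cv_s = transfer_to_mean(60.0, -0.77, 92.29)
# Product moment about the means, from the raw product moment 110.04.
mu11 = 110.04 - 5.24 * (-0.77)

print(f"stature: mean {mean_m:.2f}, S.D. {sd_m:.4f}, coeff. {cv_m:.4f}")
print(f"S.D.   : mean {mean_s:.2f}, S.D. {sd_s:.4f}, coeff. {cv_s:.2f}")
print(f"product moment about the means: {mu11:.4f}")
```

The means, standard deviations and coefficients of interracial variability come out as printed above to the last figure or two; small differences are to be expected because the raw moments are quoted to two decimals only.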

We also have $\mu_{11} = +114.0738$

$$R_{M,\sigma} = \frac{+114.0738}{36.9052 \times 9.5758} = +.3098.$$

The Prob. Error 1 of R is given by $.67449\,(1 - R^2)/\sqrt{n}$.

From Abac in Biometric Tables p. 19, we find for N = 100,
P.E. of R = .062.

Thus $R_{M,\sigma} = .3098 \pm .062$

We may now consider the correlation for our selected list of
55 reliable samples.


Stature:

Mean value of Average Stature = 1663.9454 mm.

Standard Deviation of Average Stature = 36.5953 mm.
Coeff. of Variability (interracial) = 2.1993


Standard Deviation:

Mean value of Standard Deviation = 59.3453 mm.

S.D. of Standard Deviation = 6.46 mm.

Coeff. of Variation (interracial) = 10.89

$\mu_{11} = +125.889$

Thus $R_{M,\sigma} = +.3283 \pm .082$

Selection of more reliable values does not make any substantial
difference. We may therefore conclude that there is a
positive interracial correlation of about +.3 between Average
Stature and Standard Deviation.
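
The probable errors attached to the two correlation coefficients can be checked against the usual formula, P.E. of R = .67449 (1 − R²)/√n. The following Python sketch assumes that formula and takes the printed values of R as given; the function name is illustrative.

```python
import math


def probable_error_of_r(r, n):
    """Probable error of a correlation coefficient r from n pairs (normal theory)."""
    return 0.67449 * (1.0 - r * r) / math.sqrt(n)


pe_whole = probable_error_of_r(0.3098, 100)    # whole series of 100 races
pe_selected = probable_error_of_r(0.3283, 55)  # selected series of 55 races

print(f"whole series   : {pe_whole:.3f}")
print(f"selected series: {pe_selected:.3f}")
```

The formula gives about .061 for the whole series, against the .062 read from the abac, and about .081 for the selected series, against the printed .082.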


1 K. Pearson and L. N. G. Filon: “Probable Errors of Frequency Constants”
etc., Phil. Trans. Vol. 191A, pp. 231-241.
Interracially, taller races are on the whole more variable than
shorter.

It will be noticed that the average stature of all the races is
1665.24 and in the case of selected races, 1663.94 mm.

The Anglo-Indians are thus slightly shorter than the general
average of all the races. But the difference is only about 7 mm.

In this connection it is interesting to compare the figures
given by Tschepourkowsky. 1 He finds for 92 Russian races the
mean value of average stature to be 1647.4 mm. and S.D. 33.3
mm. while for Deniker’s 84 living races, the values are 1639.6
and 55.9 respectively.

His coefficient of interracial variation of stature is 2.02. In
our series of 100 races it is slightly higher, being about 2.22, but is
of the same order.

Thus our value of interracial variability agrees generally with
a previous value found independently by another worker. 2

We can now pass on to the question of interracial correlation
between M and V.

If $v_1, v_2, v_3, v_4$ are the variabilities of $x_1, x_2, x_3, x_4$ and $r_{12}, r_{13}$, etc.,
are the correlations between $x_1$ and $x_2$, $x_1$ and $x_3$, etc., then the correlation
between $x_1/x_3$ and $x_2/x_4$ has been shown 3 to be

$$\rho = \frac{r_{12} v_1 v_2 - r_{14} v_1 v_4 - r_{23} v_2 v_3 + r_{34} v_3 v_4}{\sqrt{(v_1^2 + v_3^2 - 2 r_{13} v_1 v_3)(v_2^2 + v_4^2 - 2 r_{24} v_2 v_4)}}$$

We get correlation between M and V = 100 σ/M by putting

$$x_1 = x_4 = M, \quad x_3 = 1 \quad \text{and} \quad x_2 = \sigma$$

Then

$$v_4 = v_1, \quad v_3 = 0, \quad r_{13} = r_{23} = r_{34} = 0,$$

$$r_{14} = r_{41} = 1, \quad r_{12} = r_{42}$$

Thus

$$\rho_{MV} = \frac{r_{12} v_2 - v_1}{\sqrt{v_1^2 + v_2^2 - 2 r_{12} v_1 v_2}}$$

For the Whole Series of 100 races:

$v_1 = 2.2162$

$v_2 = 16.17$

$r_{12} = +.3098$

Hence $\rho_{MV} = +.1787 \pm .065$

For the Selected Series of 55 races:

$v_1 = 2.1993$

$v_2 = 10.89$


1 E. Tschepourkowsky: “Contributions to the Study of Interracial Correlation,”
Biom. Vol. IV (1905), pp. 286-312.

2 We may note however that the interracial variability is higher in our case.
This implies that our sample of races is more representative in character than
Tschepourkowsky’s.

3 K. Pearson: “On Spurious Correlation,” Proc. Roy. Soc. Vol. LX.
$r_{12} = +.3283$

$\rho_{MV} = +.1303 \pm .088$

The correlation in the latter case is scarcely significant,
but seems to be slightly positive.
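
The passage from Pearson's general formula for the correlation of two ratios to the special case of M and V may be verified numerically. The Python sketch below is an illustration under stated assumptions: the general function follows the four-variable formula for the correlation between x1/x3 and x2/x4, the special function is its reduction when x1 = x4 = M, x3 is constant and x2 = σ, and both function names are hypothetical.

```python
import math


def ratio_correlation(v1, v2, v3, v4, r12, r13, r14, r23, r24, r34):
    """Correlation between x1/x3 and x2/x4 from coefficients of variation and correlations."""
    numerator = r12 * v1 * v2 - r14 * v1 * v4 - r23 * v2 * v3 + r34 * v3 * v4
    denominator = math.sqrt(
        (v1 ** 2 + v3 ** 2 - 2 * r13 * v1 * v3)
        * (v2 ** 2 + v4 ** 2 - 2 * r24 * v2 * v4)
    )
    return numerator / denominator


def mean_variability_correlation(v1, v2, r12):
    """Special case x1 = x4 = M, x3 = 1 (constant), x2 = sigma."""
    return (r12 * v2 - v1) / math.sqrt(v1 ** 2 + v2 ** 2 - 2 * r12 * v1 * v2)


# Whole series of 100 races, using the printed v1, v2 and r12.
v1, v2, r12 = 2.2162, 16.17, 0.3098
rho_general = ratio_correlation(
    v1=v1, v2=v2, v3=0.0, v4=v1,
    r12=r12, r13=0.0, r14=1.0, r23=0.0, r24=r12, r34=0.0,
)
rho_special = mean_variability_correlation(v1, v2, r12)

print(f"general formula : {rho_general:.4f}")
print(f"special case    : {rho_special:.4f}")
```

With the whole-series constants both forms give about +.179, in agreement with the printed +.1787.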

Thus there seems to be a small positive interracial correlation
between the average stature and the coefficient of variation.

Assuming recent races to be more variable, the positive
interracial correlation between stature and variability may be
explained on the hypothesis that tallness is a recent acquirement
of the human species. The greater variability is not merely due
to the greater absolute size of the taller races, since the coefficient
of variability, i.e. proportional variability, itself is also positively
correlated with stature.



SECTION VII. COMPARISON OF VARIABILITIES. 
Standard Deviation of Stature. 

(a) The Whole Series. 

Let us consider the 100 different values of Standard Deviation
of Stature, which I have collected for purposes of comparison.
We notice the great range of variation of the S.D. Our extreme
values are 29.5 mm. and 97.0 mm.

Grouping by units of 5 mm. we get the following distribution:

Distribution of 100 S.D. of Stature.


Group (mm.)     Frequency

29 to 34           2
34 to 39           0
39 to 44           4
44 to 49           6
49 to 54          13.5
54 to 59          22
59 to 64          23.5
64 to 69          10
69 to 74          15
74 to 79           3
79 to 84           0
84 to 89           1


We get

Mean Value of Standard Deviation = 59.45 mm.

S.D. of Standard Deviation = 9.5237 mm.

P.E. of Mean Standard Deviation = 6.4212
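
The mean and S.D. just given can be recovered from the grouped distribution. The Python sketch below takes the twelve frequencies as read from the distribution above, places each group at its mid-point, and applies Sheppard's correction for grouping to the second moment. The use of Sheppard's correction is an assumption of this sketch (the text does not mention it), adopted because it reproduces the printed S.D. closely.

```python
import math

WIDTH = 5.0  # grouping unit in mm.
lower_edges = [29, 34, 39, 44, 49, 54, 59, 64, 69, 74, 79, 84]
frequencies = [2, 0, 4, 6, 13.5, 22, 23.5, 10, 15, 3, 0, 1]

total = sum(frequencies)
midpoints = [edge + WIDTH / 2 for edge in lower_edges]

mean = sum(f * m for f, m in zip(frequencies, midpoints)) / total
raw_variance = sum(f * (m - mean) ** 2 for f, m in zip(frequencies, midpoints)) / total
# Sheppard's correction: subtract (group width)^2 / 12 from the grouped second moment.
sd = math.sqrt(raw_variance - WIDTH ** 2 / 12)
# Probable error of a single value = 0.67449 x S.D.
probable_error = 0.67449 * sd

print(f"total = {total:g}, mean = {mean:.2f} mm., S.D. = {sd:.4f} mm.")
print(f"probable error = {probable_error:.2f} mm.")
```

The quantity printed as the P.E. equals .67449 times the S.D. of the distribution, that is, the probable deviation of a single standard deviation from the mean value, which is the figure used in the comparison with the Anglo-Indian S.D.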

We can now compare our Anglo-Indian S.D. with this Mean
Value:

Anglo-Indian S.D. = 67.38 mm.

Mean value of S.D. = 59.45 mm.

Difference = 7.93 ± 6.42 mm.

The difference 7.93 ± 6.42 mm. is not at all significant. We
can find the probability of this difference,

$x = 7.93 / 9.52 = 0.83$ approximately.

From Tables II, p. 2, $\tfrac{1}{2}(1 + \alpha) = .7967306$

$\tfrac{1}{2}(1 - \alpha) = .2032694$.

If we assume that our sample of 100 standard deviations is a
random or representative sample, then 20.3% of all “homogeneous”
races will have a S.D. greater than the Anglo-Indians, and
40.6% will differ more from the average value than Anglo-Indians.

For Stature, the absolute variability (Standard Deviation) of
Anglo-Indians is thus not significantly greater than the average absolute
variability of homogeneous races.

It will be noticed that the list contains many small samples.
It will be better to omit all samples of less than 25. Doing this
we find that extreme values have been mostly eliminated by this
process of selection, showing that such extreme values were
probably in most cases due to uncertainty of sampling rather than
to any peculiarity of the population.

I have also thought it best to exclude Scottish Insane as well
as the Dinka group. We have already seen that Anglo-Indian
variability is not significantly greater than the average variability
of the whole series. The inclusion of any variability greater than
Anglo-Indian variability will strengthen this conclusion; rejection
of greater variabilities will go against our conclusion. The
Insane are manifestly abnormal and may be neglected for the
present. Variability of the Dinka people is greater than that of
Anglo-Indians; its rejection will thus make the test more rigid.
Separate figures for Aswan are also omitted for similar reasons. 1

For the selected series of Standard Deviations

Selected Mean Stand. Dev. = 59.8929 mm.

S.D. of Standard Dev. = 6.3504 mm.

We notice that the selected Mean is almost exactly the same
as the Mean for the whole series. We conclude that 60 mm. is
about the true average absolute Variability of stature for human races.

Due to selection the S.D. of Variability is considerably reduced
because the extreme values of Variability have in most cases
been eliminated.

Anglo-Indian S.D. = 67.385 mm.

Selected Mean S.D. = 59.893 mm.


Anglo-Indian Difference = 7.492 mm.


We find the probability:

$x = 7.492 / 6.350 = 1.18$

From Biometric Tables II, $\tfrac{1}{2}(1 + \alpha) = .8809999$

$\tfrac{1}{2}(1 - \alpha) = .1190001$


Thus 11.9% of all races will have greater variabilities than
Anglo-Indians while 23% will differ more from the Selected Mean.

As judged by a reliable series of standard deviations, the Absolute
Variability of Anglo-Indians is not significantly greater than
the Average Variability of different “homogeneous” samples.


Relative Variability of Stature. 

We shall now compare the Relative Variability (as measured 
by the Coefficient of Variation) of our Anglo-Indian data with the 
variability of samples recognised to be homogeneous. 
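
As a minimal sketch of the quantity being compared, the Coefficient of Variation is the Standard Deviation expressed as a percentage of the Mean. The function name is illustrative only; the figures are the stature mean and S.D. quoted in Section VIII for the 191 individuals of known age.

```python
def coefficient_of_variation(sd: float, mean: float) -> float:
    """Relative variability: the S.D. as a percentage of the Mean."""
    return 100.0 * sd / mean


# Stature of the 191 individuals of known age (Section VIII), in mm.
v = coefficient_of_variation(sd=65.4923, mean=1656.14)
print(round(v, 4))
```

Because the Mean appears in the denominator, a tall and a short population with the same S.D. in millimetres will not have the same relative variability, which is why the absolute and relative comparisons are carried out separately.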


1 J. I. Craig: Biometrika Vol. 8 (1911), p. 70.
(a) Whole Series.

Distribution of 100 Coefficients of Variation of Stature.

Group            Frequency
1.80 to 2.20        2
2.20 to 2.60        3
2.60 to 3.00        9
3.00 to 3.40       20.5
3.40 to 3.80       34.0
3.80 to 4.20       20.5
4.20 to 4.60        8.0
Beyond 4.60         3
Total             100


Grouping in units of .4, we find moment-coefficients about
the Mean,

μ2 = 1.8866
μ3 = −.7780
μ4 = 11.617474

giving

β1 = .094673
β2 = 3.372620 ± .6330

with

sk. = .13448

Mean Coefficient of Variation = 3.5700
and
S.D. of Coefficient of Variation = .5450

Curve belongs to Type IV, but the Gaussian itself will be
quite adequate.

“Goodness of Fit” of Coefficients of Variation.

Coeff. of V.    Observed m'   Theoretical m   m − m'   (m − m')²/m
Beyond 2.20         2              .742        1.258      2.1320
2.20 to 2.60        3             3.538         .538       .0818
2.60 to 3.00        9            11.512        2.512       .5481
3.00 to 3.40       20.5          22.912        2.412       .2531
3.40 to 3.80       34.0          27.934        6.066      1.3170
3.80 to 4.20       20.5          20.769         .239       .0028
4.20 to 4.60        8.0           9.459        1.459       .2250
Beyond 4.60         3.0           3.130         .130       .0054

n' = 8
χ² = 4.5660
P = .712163


Thus the Gaussian gives excellent fit. In seven cases out of 
ten, the fit will be worse. 
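
The χ² total in the table can be checked directly from the observed and theoretical frequencies. The following is a minimal sketch: the function name is illustrative, and the lookup of P for n' = 8 from the published tables is not reproduced here.

```python
# Observed (m') and theoretical (m) frequencies from the table above.
observed = [2, 3, 9, 20.5, 34.0, 20.5, 8.0, 3.0]
theoretical = [0.742, 3.538, 11.512, 22.912, 27.934, 20.769, 9.459, 3.130]


def chi_square(obs, theo):
    """Sum of (m - m')^2 / m over all frequency groups."""
    return sum((m - m_obs) ** 2 / m for m_obs, m in zip(obs, theo))


chi2 = chi_square(observed, theoretical)
print(round(chi2, 3))
```

The first group alone contributes nearly half of the total, which is the reason for combining the two end groups in the next step.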

We notice that one terminal frequency gives rather a large
value, i.e. 2.1320; combining the two end groups, we get

χ² = 2.555

n' = 8, P = .854587

The fit is now considerably improved. I conclude that the
Coefficient of Variation (for homogeneous groups) can itself be
graduated by the Gaussian curve. We can now safely apply the theory
of Errors (which is based on the Gauss-Laplacian Probability Integral)
to judge the likelihood of deviations from the Mean.

Anglo-Indians V = 4.0672
Average V = 3.5700

Anglo-Indian Difference = .4972

Now the S.D. of V = .5450

Thus, P.E. of V = ±.3676

Anglo-Indian Difference = .4972 ± .3676

x = D/σ = .4972/.5450 = .91

From Biometric Table II, ½(1 + α) = .818588
½(1 − α) = .181412

Thus we find that no less than 18.14% of “homogeneous”
races will have larger Coefficients of Variation than Anglo-Indians.
The Anglo-Indian Coefficient of Variation is not significantly
greater than the average Coefficient of Variation of the whole series.
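
The table lookup used here can be sketched with the normal probability integral. This is an illustration only: `upper_tail` is a hypothetical helper name, and the error function from the Python standard library stands in for Biometric Table II.

```python
import math


def upper_tail(x: float) -> float:
    """Area of the normal curve beyond +x, i.e. the quantity (1 - alpha)/2."""
    return 0.5 * (1.0 - math.erf(x / math.sqrt(2.0)))


anglo_indian_v = 4.0672
mean_v = 3.5700
sd_v = 0.5450

# Deviation in units of the S.D., rounded to two places as in the text.
x = round((anglo_indian_v - mean_v) / sd_v, 2)
p_greater = upper_tail(x)          # proportion with a larger V
p_differ_more = 2.0 * p_greater    # proportion differing more in either direction
print(x, round(p_greater, 6), round(p_differ_more, 6))
```

The same three steps (difference, division by the S.D. of the series, tail area) are repeated for every comparison that follows; only the mean and S.D. of the reference series change.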

(b) Selected Series. 

We obtain the following distribution of the Coefficients of
Variation for 55 selected 1 races (unit of grouping = .2).

Distribution of 55 selected Coefficients of Variation.

Group          Frequency
2.7 to 2.9        3
2.9 to 3.1        5.5
3.1 to 3.3        1.5
3.3 to 3.5        9.0
3.5 to 3.7       17.5
3.7 to 3.9        9.5
3.9 to 4.1        4.0
4.1 to 4.3        5.0
Total            55



We get,

Mean Coefficient of Variation = 3.571
Standard Deviation of Coefficient of Variation = .3590
P.E. of Mean V = .2421

The other constants are:

μ2 = 3.222645
μ3 = 1.596201
μ4 = 29.891268
β1 = .066153
β2 = 2.9793

1 It will be noticed that the extreme values have been automatically excluded
by our principle of rejection of unreliable values.

The stability of the Mean Value is remarkable. For the whole
series it was 3.57, for the selected races it is 3.571. It therefore
seems likely that 3.57 is very near the true typical coefficient of
variation (of stature) for homogeneous non-Indian samples.

The S.D. is much reduced by selection. This is now .3590
as against .5450 for the whole series. We have selected the more
reliable values, but this has also excluded almost all extreme
values. Great divergence from the Mean value is thus probably
due more to paucity of material than to actual peculiarities of
distribution.

Anglo-Indian V = 4.0672
Selected Mean V = 3.5710

Anglo-Indian Difference = .4962 ± .2421

The actual difference is again the same, but this is now
nearly twice the Probable Error.

We have,

x = D/σ = .4962/.3590 = 1.38 approximately

From Biometric Table II, ½(1 + α) = .9162047
½(1 − α) = .0837953

Thus 8.38% of all reliable samples will actually be more
variable than Anglo-Indians, while 16.55% will differ more from
the Mean.

Anglo-Indian Variability of stature is not significantly higher
than the average Variability of selected samples.

(c) Selected and Weighted Series. 

Still another course is open to us. We can consider the
“weighted Mean” 1 and “weighted” Standard Deviation of the
Coefficient of Variation. For this purpose, we choose our weights
to be proportional to 1/E², where E is the probable error, i.e.
give “weights” proportional to reliability.

We get,

Weighted Mean V = 3.7622
Weighted S.D. of Mean V = .1846

We notice that the Mean is now considerably higher. This is
due to the much greater reliability in the measurements of the
more civilised races, who have invariably higher variabilities. This
greater value is also due in a large measure to the weight of the
U.S.A. recruits (w = 7623 against 10 for the lowest weight), a sample
which includes 25,898 individuals.
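
A weighted mean of this kind can be sketched as follows. The weights are taken as 1/E², as stated above; the formula used for the weighted S.D. (a weighted root-mean-square deviation about the weighted mean) is an assumption, and the two sample values are hypothetical, chosen only to make the arithmetic easy to follow.

```python
import math


def weighted_mean_and_sd(values, probable_errors):
    """Weighted mean and weighted S.D., with weights proportional to 1/E^2."""
    weights = [1.0 / e ** 2 for e in probable_errors]
    total = sum(weights)
    mean = sum(w * v for w, v in zip(weights, values)) / total
    variance = sum(w * (v - mean) ** 2 for w, v in zip(weights, values)) / total
    return mean, math.sqrt(variance)


# Hypothetical illustration: two coefficients of variation, the second
# measured twice as precisely (half the probable error) as the first.
m, s = weighted_mean_and_sd([3.0, 4.0], [1.0, 0.5])
print(m, s)
```

Halving the probable error quadruples the weight, so a single very large sample can dominate the weighted mean, which is the effect noted for the U.S.A. recruits.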


1 See Yule : “ Theory of Statistics ” (Charles Griffin, 1919), p. 220. 
Anglo-Indian V = 4.0672

Selected Weighted Mean V = 3.7622

Anglo-Indian Difference = .3000

P.E. of Difference = ±.1245

x = D/σ = 1.63 approximately

From Biometric Table II, ½(1 + α) = .9484493
½(1 − α) = .0515507

5.1% will be more variable, while 10.2% will differ more
from the weighted average than Anglo-Indians.

Thus even when compared with the weighted Mean, Anglo-Indian
Variability is not significantly greater than average Variability.

We have seen that U.S.A. recruits raise the weighted Mean 
very considerably. But it is not at all certain that the recruits of 
the U.S.A. Army are possessed of any great degree of homogeneity. 
One would surmise rather that they are heterogeneous in character.
Let us see the effect of leaving out U.S.A. recruits.

Omitting U.S.A. recruits we get

Weighted Mean V = 3.6413
Weighted S.D. of V = .2509
Weighted P.E. of V = ±.1683

Anglo-Indian V = 4.0672
Weighted Mean V = 3.6413

Difference = .4259 ± .1683

x = D/σ = .4259/.2509 = 1.70

From Biometric Table II, ½(1 + α) = .9554345
½(1 − α) = .0445655

4.5% will have greater Variabilities than the Anglo-Indian
sample. As regards the Coefficient of Variation, this is the most
stringent test we can apply with the non-Indian material at our
disposal. We find

Anglo-Indian Variability is within the limits of probability of
homogeneous Variation. Study of the Coefficient of Variation for
Stature does not enable us to assert definitely that the present
Anglo-Indian sample is heterogeneous in character.

I shall now consider the whole series of non-Caste samples
including the Mediterranean samples. I have omitted the separate
age-groups for the New South Wales Criminals and the Oxford
student data. As all these have greater Variability than the
Average, the stringency of our test will not be diminished by this
rejection. 1 Another reason why I have omitted the different
age-groups is this. My purpose is to compare the Anglo-Indian
Variability with the general average Variability of other races.
If the Coefficient of Variation for the same race is given several
times over under different age-groups, too much weight will
obviously be given to this particular race. I also omit Dinka. 1

1 See discussion on p. 72.


Distribution of Coefficients of Variation of 107 non-Caste Samples.


Group 1 

1 

Be- ! 
yond ! 
i-8o | 

i-8o 

to 

1-90 

I-90 

to 

2*CO 

2*00 

to 

2*10 

2 10 

to 

2-20 

2*20 

to 

2*30 

2*30 

to 

2*40 

2*40 

to 

2*50 ! 

2*50 

to 

2 60 

2*60 

to 

270 

Frequency 

1 1 

! 1 

O 

0 

1 

1 

1 1 

O 

O 

I 

O 

2 

I 

Group 

-2*80 

“2*90 

-3 CO 

-3-10 

-3-20. 

-3-30 

- 3‘40 

-3-50 

-360 

1 

-370 

Frequency 

4 

1 3 

3 

8 

1 4 

3 

5 

6 

18 

1 9 

Group 

-3 80 

-3 90 

1 

-4-00. 

-4*10 

i 

; -4-2o 

-4-30 

-4-40 

-4 60 

-4*80 

-5-00 

Frequency 

7 

8 

9 

1 

i 

j 5 

4 

I 

I 

I 

I 


Grouping in units of .1, I find

μ2 = 28.996064,
μ3 = −50.770011,
μ4 = 3252.087371.

Thus β1 = .10573 and β2 = 3.868.

Curve is of Type IV, but to a first approximation we can
apply the “normal” curve of errors.

Mean Coefficient of Variation (107 samples) = 3.5353 ± .0348
Standard Deviation of Coefficient of Variation = .5385 ± .0245

The Mean Value is slightly lower than the one found earlier.
This is due to the fact that I have omitted the Dinka group here.
If we include the Dinka group, the Mean Value would be raised
to 3.553 which compares favourably with the value 3.570, a
difference of .035 only.

Anglo-Indian Coefficient of Variation = 4.0672
Mean Coefficient of Variation = 3.5353

Anglo-Indian Difference = .5319

x = D/σ = .5319/.5385 = .988

From Biometric Table II, ½(1 + α) = .8384217
½(1 − α) = .1615783


1 See discussion on p. 62. 













Thus as before the Anglo-Indian sample does not seem to be 
significantly more variable than homogeneous samples. About 
16% of homogeneous samples will have a greater Variability. 

(b) Selected Series.

Let us now select samples greater than 25. We get a total
(omitting different age-groups) of 67 samples distributed as
follows:


Distribution of 67 Selected Coefficients of Variation.


Group 

Be¬ 

yond 

270 

-2*80 

; -2*90 

-3*oo 

-3"io 

-3*20 

“ 3 * 3 ° 

i 

- 3*40 

-3-50 | -3-60 

Frequency 

2 

2 

1 

1 1 

5 

1 

i 

2 

I 

2 

5 

15 7 

Group | 

1 

-370 

-3*80! 

! 

-3 90 

-4-00 

-4/10 

-4*20 

1 

-4*30 

i 

Total. 


Frequency . ! 

6 

8 ! 

4 

I 

1 

I 4 

I 

I 




We get,

μ2 = 12.977882
μ3 = 17.774069
μ4 = 477.703855

giving

β1 = .144394 ± .1247
β2 = 2.835367 ± .382

Graduation by the “normal” curve is thus possible and we
are justified in using the “normal” Probability Integral.

Mean Value of Coefficient of Variation = 3.5843 ± .0297
Standard Deviation of Coefficient of Variation = .3602 ± .0210

It will be noticed that the Mean Value 3.584 is sensibly the
same as we had obtained without including the Mediterranean
data, i.e. 3.571. The difference is only .013 while the probable
error is certainly greater than .03. Thus 3.58 may be safely
taken as a standard value for the Coefficient of Variation for
Stature of homogeneous non-Caste samples.

The mean value for the whole series, 3.5353, is smaller than
the mean value for selected samples, 3.5843, because in small
samples the dispersion is more likely to be smaller. 1

Let us now compare the Anglo-Indian Variability with the 
above Mean Variability. 

Anglo-Indian Coeff. of Variation = 4.0672

Mean Selected Coeff. of Variation = 3.5843

Anglo-Indian Difference = 0.4829


1 For a discussion of the dependence of Standard Deviation on the size of
sample see Biometrika Vol. 10 (1915) p. 572 and Vol. 11 (1916) p. 277.
Hence

x = D/σ = 0.4829/0.3602 = 1.34

From Table II, ½(1 + α) = .9098773
½(1 − α) = .0901227



Thus nearly 9% of homogeneous samples will have a greater 
Variability. The inclusion of the new Mediterranean series does 
not affect our previous conclusion. 

The Variability of the Anglo-Indian sample though higher than 
the Average is not excessively so and the difference is not statistically 
significant. 

Indian Caste Variability. 


(a) Whole Series. 

I shall now consider the Coefficient of Variation of the Indian
Caste data of Risley. Omitting 3 tribes in which the sample
consists of only 2 individuals I get a total of 84 Castes and Tribes.


Distribution of 84 Caste Coefficients of Variation.

Group        −2.1  −2.2  −2.3  −2.4  −2.5  −2.6  −2.7  −2.8  −2.9  −3.0
Frequency      2     1     2     0     1     0     1     3     8     7

Group        −3.1  −3.2  −3.3  −3.4  −3.5  −3.6  −3.7  −3.8  −3.9  −4.0
Frequency      5    12    13     7     7     6     3     0     5     1



Grouping by .1, I get

Mean Value of Caste Coefficient of Variation = 3.2395
Standard Deviation of Coefficient of Variation = .3943
Anglo-Indian Coefficient of Variation = 4.0672
Anglo-Indian Difference = .8277

x = .8277/.3943 = 2.099

From Biometric Table II, ½(1 + α) = .9821356
½(1 − α) = .0178644

Only about two per cent of Indian Caste samples will show
greater variability. It seems therefore likely that the
Anglo-Indian sample is really differentiated from the Indian Castes in
showing a just significant degree of greater variability.

It should be noted that the Caste Variability is much lower 
than the non-Caste Variability. 

We have 

Non-Caste Variability = 3.5700 ± .0368
Caste Variability = 3.2395 ± .0290

Caste Difference = .3305 ± .0422
The difference is nearly eight times the probable error of the
difference. Hence we conclude that Caste Variability is
significantly lower than the Average Variability of other homogeneous
samples.

It is interesting to find that while the Anglo-Indian sample
is not significantly more variable than non-Caste samples, it does
seem to be just significantly more variable than Caste samples.

The Anglo-Indian sample is “mixed” from a Caste standpoint
but is not so from the standpoint of ordinary stable populations.
We shall see later that the Anglo-Indians are about as
variable as modern European samples.

(b) Selected Indian Castes.

I now select samples of 25 and more from the Caste data.


Distribution of 70 Selected Caste Coefficients of Variation.


Group 



Frequency- 


Group 



- 3'5 -3'6 -37 I -3*8 ~3 "9 Total.; 



With .1 as the grouping unit, I find

Mean Selected Coefficient of Variation = 3.3043 ± .0278
Standard Deviation of Coeff. of Variation = .3458 ± .0197
Mean non-Caste Coeff. of Variation = 3.5710 ± .0326

Caste Difference = 0.1667 ± .0429

In this case also the difference is nearly four times the probable
error. We conclude that the Indian Caste samples have got a
substantially lower Variability than non-Caste samples.

We shall now compare Anglo-Indian Variability with the 
selected Caste Variability. 

Anglo-Indian Variability = 4.0672
Selected Caste Variability = 3.3043

Anglo-Indian Difference = .7629

Thus x = .7629/.3458 = 2.206

From Biometric Table II, ½(1 + α) = .9864474
½(1 − α) = .0135526

The chance is only 13 in 1000 that the Variability of an
Indian Caste will be greater than Anglo-Indian Variability. These
are the lowest odds we have got up till now.
To sum up,

The Anglo-Indian Variability is significantly greater than Caste 
Variability but is not beyond the range of homogeneous Variability. 

Other Comparisons. 

I shall give a short summary of other comparisons, reserving
a fuller discussion for a future paper on the Caste data.

Pooling together the 84 Caste and the 109 other samples we 
get a total of 193 (all samples). 

I find

Mean Value of Coefficient of Variation = 3.4231 ± .0240
Anglo-Indian Coefficient of Variation = 4.0672

Anglo-Indian Difference = .6441
Standard Deviation = .4949 ± .0169

Thus x = .6441/.4949 = 1.301

From Biometric Table II, ½(1 + α) = .9031995
½(1 − α) = .0968005

Anglo-Indian Variability would be exceeded by nearly 10% of
total (Caste and non-Caste) samples.

Selecting samples greater than 25 we get a total of 137 fairly 
reliable samples. 


Distribution of 137 Selected Coefficients of Variation. 


Group 

Be- 
! yond 

. . 1 2’2 

1 

| 

-2*3 

-2*4 

| 

-2*5 

-2-6 

-27 

-2*8 

i 

-2-9 ; 

- 3 *o 

-3 1 

Frequency 

i ‘ 

I 

0 

I 

O 

3 

4 

8 

IO 

. 

5 

Group 

| - 3*2 

| 

- 3*3 

1 

- 3*5 

- 3-6 

-3*7 

-3*8 

- 3*9 

-4-0 

- 4 *i 

Frequency 

i ,3 

14 

■ 

21 

12 

9 

8 

9 

1 

4 

l 

Group 

i 

j -4*2 

- 4*3 

Total. 








Frequency 

I 

I 

137 









Grouping by .1 I find:

μ2 = 14.421229
μ3 = −17.987290
μ4 = 760.822619

Hence

β1 = .10796 ± .14439
β2 = 3.65892 ± .91379

Thus we are justified in applying the normal integral for
calculating the chances for any deviation.
Anglo-Indian Variability = 4.0672
Mean Selected (137 samples) = 3.4412 ± .0219

Anglo-Indian Difference = .6260
Standard Deviation = .3797 ± .0155

x = .6260/.3797 = 1.62

From Biometric Table II, ½(1 + α) = .9473839
½(1 − α) = .0526161

Thus over 5% will have greater Variability. The difference
can scarcely be called significant.


Standard Deviations. 

I shall merely give the final results. (The complete figures 
will be published in a supplement). 

(a) All Samples (Caste and others), total = 190.

Mean Standard Dev. = 57.0684 ± .4271 mm.
Anglo-Indian Standard Dev. = 67.3850

Anglo-Indian Difference = 10.3167
S.D. of Standard Dev. = 8.7302 ± .3020

x = 10.3167/8.7302 = 1.181

From Biometric Table II, ½(1 + α) = .8809999
½(1 − α) = .1190001

Thus nearly 12% will have a greater Standard Deviation
than the Anglo-Indian sample.

(b) Selected Samples (Caste and others) greater than 25, total = 134.

Mean Standard Dev. = 56.7612 ± .3987 mm.
Anglo-Indian Standard Dev. = 67.385

Anglo-Indian Difference = 10.6238
S.D. of Standard Deviation = 6.8424

x = 10.6238/6.8424 = 1.552

From Biometric Table II, ½(1 + α) = .9394292
½(1 − α) = .0605708

Six per cent will have a greater variability than the
Anglo-Indians.

(a) All Non-Caste Samples, total = 106.

Mean Standard Dev. = 59.2830 ± .6138 mm.
Anglo-Indian Standard Dev. = 67.385
Anglo-Indian Difference = 8.102
S.D. of Standard Deviation = 9.3688 ± .4340

x = 8.102/9.3688 = 0.864

From Biometric Table II, ½(1 + α) = .8051055
½(1 − α) = .1948945

Over 19% will have greater absolute variability.

(b) Selected Non-Caste Samples greater than 25, total = 64.

Mean Standard Dev. = 60.6563 ± .5453 mm.
Anglo-Indian Difference = 6.7287
S.D. of Standard Deviation = 6.4676 ± .3856

x = 6.7287/6.4676 = 1.04

½(1 + α) = .85083
½(1 − α) = .14917

(a) All Caste Samples, total = 84.

Mean Caste S.D. = 53.0714 ± .4693 mm.
Anglo-Indian S.D. = 67.385
Anglo-Indian Difference = 14.314
S.D. of Standard Deviation = 6.3785 ± .3320

x = 14.314/6.3785 = 2.244

From Biometric Table II, ½(1 + α) = .9874545
½(1 − α) = .0125455

Only 12 in 1000 castes will have a greater variability than
the Anglo-Indian sample. Thus we may conclude that the Absolute
Variability of the Anglo-Indian sample is appreciably greater
than Caste Variability.

Also

Non-Caste Mean S.D. = 59.2830 ± .6138 mm.
Caste Mean S.D. = 53.0714 ± .4693

Caste Difference = 6.2116 ± .7727

Thus Absolute Variability of Caste samples is significantly
less than Non-Caste Variability.

(b) Selected Caste Samples greater than 25, total = 70.

Mean Selected Caste S.D. = 53.8 ± .4429 mm.
Anglo-Indian S.D. = 67.385
Anglo-Indian Difference = 13.585 ± 2.471
S.D. of Standard Deviation = 5.4938 ± .3131

x = 13.585/5.4938 = 2.471

From Biometric Table II, ½(1 + α) = .9932443
½(1 − α) = .0067557

Thus only about 7 in 1000 will have greater variability.
Again,

Selected Non-Caste S.D. = 60.6563 ± .5453 mm.
Selected Caste S.D. = 53.8 ± .4429

Caste Difference = 6.8563 ± .7025

Selected Caste Variability is thus significantly lower.

We conclude from our comparative study of variabilities that 
Anglo-Indian Variability though high is not sufficiently so to enable 
us to assert that the material is heterogeneous. The Anglo-Indian 
sample is however markedly more variable than the Risley Samples
of Indian Castes and Tribes. 

I shall now consider a series of modern European races for 
which reliable data is available. 



Modern European Races.

Sample (number)               S.D. (mm.)
Aberdeen students (493)         59.4
Cyprus (585)                    61.6
Cambridge students (1000)       64.6
U.S.A. recruits (25,898)        65.6
Albanians (140)                 65.7
N.S.W. Criminals (2871)         65.8
Oxford students (959)           66.1
Germans (390)                   66.8
Crete (318)                     67.5
Eng. Criminals (3000)           68.1
Eng. Fathers (1078)             68.7
Eng. Sons (1078)                69.4

Anglo-Indian S.D. = 67.385
Average European S.D. = 65.775
Anglo-Indian Difference = 1.61
S.D. of S.D. = 2.75
Anglo-Indian excess in terms of S.D. = 1.61/2.75 = 0.58


Thus the Anglo-Indian variability is only 1.61 mm. greater
than average variability of European races. We have however
included no less than five different English samples. If we retain
only the largest English sample (3000 criminals) we get the Mean
variability to be 65.375 mm. with a S.D. of 2.513 mm. The
Anglo-Indian excess is 2.0 mm. or in terms of the S.D. is 0.79586.

We conclude that Anglo-Indian Variability is of the same 
order as modern European variability. 
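
The figures for the full European list can be reproduced with a short sketch. It assumes, as the printed value 2.75 suggests, that the S.D. of the twelve standard deviations is taken with divisor n rather than n − 1; the variable names are illustrative.

```python
from statistics import mean, pstdev

# Standard deviations of stature (mm.) for the twelve European samples listed.
european_sd = {
    "Aberdeen students": 59.4, "Cyprus": 61.6, "Cambridge students": 64.6,
    "U.S.A. recruits": 65.6, "Albanians": 65.7, "N.S.W. Criminals": 65.8,
    "Oxford students": 66.1, "Germans": 66.8, "Crete": 67.5,
    "Eng. Criminals": 68.1, "Eng. Fathers": 68.7, "Eng. Sons": 69.4,
}
anglo_indian_sd = 67.385

values = list(european_sd.values())
average = mean(values)                  # average European S.D.
spread = pstdev(values)                 # S.D. of the S.D.s, divisor n
excess = (anglo_indian_sd - average) / spread
print(round(average, 3), round(spread, 2), round(excess, 2))
```

An excess of well under one S.D. places the Anglo-Indian sample comfortably inside the European range; four of the twelve European samples are themselves more variable.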


Conclusions. 

I have proposed five distinct tests of “homogeneity.”

I The frequency distribution should be homotypic. 

II It should resist statistical dissection ; 

III Subsamples should not differ significantly, 

IV The general nature of the distribution should be similar 
to homogeneous distribution. 

V The Variability should not differ significantly from the 
average Variability of homogeneous races. 
(1) I have shown that graduation by the Gaussian (possibly
still better by a Type IV curve) is adequate. Anglo-Indian
frequency distribution is certainly homotypic. Our first test thus fails
to show any sign of heterogeneity in the material.

(2) Excepting for a very special type of dissection (which is 
probably a peculiar feature of the particular sample considered) 
statistical analysis into component groups is not possible. Our 
second test too fails to detect heterogeneity. 

(3) We have seen that the difference between subsamples is 
statistically insignificant. Subsamples seem to agree quite well , 
thus confirming statistical homogeneity of the material. 

(4) The general nature of Anglo-Indian frequency distribution 
is also similar to other homogeneous distribution. Anglo-Indian 
distribution is approximately Gaussian with some tendency towards 
type IV, lepto-kurtosis and small asymmetry. Other known cases 
of stature distribution show the same characteristics. The fourth 
test thus supports the view that the present material is homo¬ 
geneous. 

(5) I have compared the Variability of the Anglo-Indian with 
Variabilities of other races in many different ways. 

Anglo-Indians are more variable than the Indian Castes and
Tribes but the Variability of the Anglo-Indian sample is not
significantly greater than the average Variability of homogeneous
samples in general.



SECTION VIII. NOTE ON CORRELATION BETWEEN 

AGE AND STATURE. 


I shall give a short summary of the values of the Coefficient 
of Correlation between Age and Stature, reserving a fuller discus¬ 
sion for a future part. 


(a) The whole series (all ages), total = 191.

The age has been recorded in the case of 191 out of the total 
group of 200 which we have been considering so far. I have used 
the standard “product moment ” method. 1 

I find for stature, with 50 mm. unit of grouping and 1660 mm. 
as base number, 

v/- - *14136, and 

Thus Mean Stature = 1656.14 mm.

S.D. = σx = 65.4923 mm.

For age, with one year unit of grouping and base number = 24
years, the first and second moment-coefficients about the base
number are

+.27 and 44.98.

Thus Mean Age = 24.27 years

S.D. = σy = 6.7022 years.


With the same units and base numbers we find the product
moment to be +40.2670.

Correcting for base number, we have

μ11 = Product moment = +40.22

Thus

r = +40.22/(6.7022 × 65.4923) = +.1089

The Probable Error is given by .6745 (1 − r²)/√N.
N = 191, hence P.E. is = ±.049.

We have then

r = +.1089 ± .049.



The correlation coefficient is slightly over twice its Probable 
Error, hence it is not definitely significant. In any case the correla¬ 
tion between age and stature seems to be small. 
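
The significance judgment rests on the ratio of r to its Probable Error. A minimal sketch of the formula quoted above follows; the function name is illustrative, and direct evaluation gives about .048, close to the ±.049 quoted in the text.

```python
import math


def probable_error_of_r(r: float, n: int) -> float:
    """Probable Error of a correlation coefficient: .6745 (1 - r^2) / sqrt(N)."""
    return 0.6745 * (1.0 - r * r) / math.sqrt(n)


r = 0.1089
n = 191
pe = probable_error_of_r(r, n)
ratio = r / pe  # how many Probable Errors the coefficient amounts to
print(round(pe, 4), round(ratio, 2))
```

A ratio only slightly above two is the basis for calling the coefficient "not definitely significant".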

The low average age of the whole sample shows the presence 
of a considerable number of individuals in their early youth. I 
next separated the measurements of those above 25 years of age 


1 Yule, Statistics, Chap. IX.

Karl Pearson: “Regression, Heredity and Panmixia,” Phil. Trans. Roy.
Soc. Vol. 187A, 1896.
from the measurements of those below 25, and considered the
correlation for the two different age groups separately.

(b) Age group below 25, total = 125.

I find for age

Mean Age = 20.52 years

S.D. = σy = 2.2449 years

For stature, Mean Stature = 1649.35 mm.

S.D. = σx = 61.35 mm.

Also μ11 = Product moment = +20.16

We notice that the average stature of the lower age group is 
only 7 mm. less than the general average. The S.D. is also less 
than the general average, showing that the lower age group is less 
variable than the general sample. I shall come back to this point 
later on. 

We find the coefficient of correlation to be 

r = +.1464 ± .058

The correlation is positive but small. It is just on the verge 
of being significant. The positive character of the coefficient is of 
course expected, it merely indicates, or rather actually measures 
the average rate of growth with age. The material includes only 
a few cases of 16, the lowest age group, and so it is not possible to 
say very much about the actual variations in the rate of growth. 
The smallness of the coefficient (if not due to errors of sampling) 
seems to suggest that the greater part of the increase in stature is 
attained before the age of 16 or 17. Thus the Anglo-Indian seems
to be, so far as stature is concerned, rather precocious in growth. 
I shall discuss this point after investigating the correlation between 
age and the other characters. 

(c) Age group above 25, total = 66. 

I find 

Mean age = years 

S.D. = σy = 6.5765 years

Mean Stature = 1688.1818 mm.

S.D. = σx = 71.072 mm.

Product moment = — 55’4^^4 
Thus, r = −.1187 ± .08.

The coefficient is now negative but is scarcely significant in 
view of its large probable error. A small negative correlation is to 
be expected in view of the shrinkage which sets in after 25 or 30. 1 

1 Powys: “Anthropometric Data from Australia,” Biometrika Vol. I (1902),
p. 49.
The value of the Absolute Variability is for the

Lower age group = 65.4923 ± 2.7939 mm.

Higher age group = 71.0720 ± 4.1726 mm.
Difference = 5.5797 ± 5.02 mm.

The variability of the younger group is thus considerably 
less, but the difference is scarcely significant. Even though we 
cannot definitely assert that the variability is being reduced with 
time, the above noticed decrease is certainly interesting as giving 
an indication that such a view is not altogether untenable. 

If we turn to the Relative Variability, i.e. the Coefficient of
Variation, we find

Higher age group = 4.2073 ± .2470

Lower age group = 3.9545 ± .1687
Difference = 0.2528 ± .2991

The difference is less significant than the previous one. But 
the reduction in even the relative variability is distinctly sugges¬ 
tive. 
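
The probable errors attached to the two differences follow from combining the separate probable errors as the root of the sum of their squares. The sketch below assumes the two age groups are treated as independent; the function name is illustrative.

```python
import math


def pe_of_difference(pe_a: float, pe_b: float) -> float:
    """Probable Error of a difference of two independent quantities."""
    return math.hypot(pe_a, pe_b)


# Absolute variability (S.D. in mm.) of the two age groups.
abs_pe = pe_of_difference(2.7939, 4.1726)
# Relative variability (Coefficient of Variation) of the two age groups.
rel_pe = pe_of_difference(0.2470, 0.1687)

print(round(abs_pe, 2), round(5.5797 / abs_pe, 2))
print(round(rel_pe, 4), round(0.2528 / rel_pe, 2))
```

Both differences come out at about one Probable Error or less, which is why neither can be called significant.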

Another point must be carefully noted. The variability of the 
Anglo-Indian sample is not significantly diminished by selection 
of age groups. Thus the high value of the variability (both 
absolute and relative) is not merely due to the mixing of the 
different age groups but represents a real degree of dispersion. 



SECTION IX. SUMMARY OF CONCLUSIONS. 

Statistical. 

(1) For stature, with samples of the order of 200, a grouping
unit of 50 mm. is fairly satisfactory. For calculating
frequency constants the grouping unit should be less than 3.5 √N
(for samples of size N).

(2) Sheppard’s corrections lead to substantial improvement 
in the frequency constants and should never be omitted. With 
small samples finer corrections (e.g. Pairman and Pearson) are use¬ 
less. 

(3) The Gauss-Laplacian normal curve is adequate for 50
mm. grouping. For proper graduation, i.e. for testing goodness
of fit, the grouping should be broader than 700/√N.

(4) The actual frequency curve belongs to Type IV of
Pearson’s Skew family. There is small positive asymmetry with the
Mode greater than the Mean, and a slight tendency towards
leptokurtosis. The general nature of the distribution is similar to other
homogeneous distributions.

(5) There is no definite evidence of statistical heterogeneity. 
The Anglo-Indian sample may be accepted as a statistically homo¬ 
geneous sample. 

Anthropological (Stature).

(1) The more highly civilised races have greater variabilities
than the average.

(2) This greater variability of more highly civilised races seems
to be only moderate in degree and is never excessive.

(3) Interracially, taller races seem to be more variable than
the shorter (both as regards the absolute and the relative
variability).

(4) Indian Castes and Tribes are significantly less variable
than the average.

(5) Anglo-Indian variability is greater than Indian Caste
variability but is of the same order as the variability of modern
European races.

(6) The variability of the Anglo-Indian sample though greater
than the average is not beyond the range of possibility of
homogeneous variability.

(7) The Anglo-Indians seem to be rather precocious in 
growth, and there is some indication of the arrest of growth 
occurring at an earlier age than in the case of European races. 

(8) Variability of the smaller age-groups is distinctly less, 
showing a decrease of variability with time (or increasing homo¬ 
geneity of the younger generation). 



APPENDIX I. NOTE ON STATISTICAL TERMS. 

In this appendix I have made an attempt to explain, in non- 
mathematical language, some of the more frequently occurring 
technical terms of statistical theory. Considerations of space have 
prevented me from giving concrete illustrations. I hope however 
that the following pages will serve some useful purpose in helping 
anthropologists who lack the requisite mathematical training, in 
taking an intelligent interest in the various technical discussions 
contained in this paper. I have only attempted to give a general 
idea of the different terms; the statistician will, I hope, forgive 
me for the consequent lack of precision in many places. 

Let us consider our 200 measurements of Anglo-Indian stature. 
Almost all individual measurements are different from one another.
The existence of variability is patent. The important fact is, 
however, that this variability of stature is not chaotic in its dis¬ 
tribution, but that it is governed by definite laws. 1 

We can classify our material into different groups in 
accordance with size. We find, for example, that there are 2 
individuals whose heights are less than 1465 mm. Between 1465 and 
1485, there is only one. Between 1485 and 1505, there is again 
one; between 1505 and 1525, there are 4, and so on. Thus with a 
20 mm. unit of grouping, we get the following distribution of 
frequency in each group. (The number of individuals in any group 
is called the frequency of that group.) 


Frequency Groups in units of 20 mm. 

Group (mm.)       Number
1445 to 1465        2
1465 to 1485        1
1485 to 1505        1
1505 to 1525        4
1525 to 1545        2
1545 to 1565        4
1565 to 1585       10
1585 to 1605       12
1605 to 1625       25
1625 to 1645       32
1645 to 1665       21
1665 to 1685       17
1685 to 1705       21.5
1705 to 1725       18.5
1725 to 1745       10
1745 to 1765        5
1765 to 1785       10
1785 to 1805        2
1805 to 1825        0
1825 to 1845        1
1845 to 1865        1
Total             200


These frequency groups are shown graphically in Plate I. 

1 Cf. Goring: “The English Convict,” p. 29. 

Let the horizontal x-axis represent stature. Then, at the 
middle point of each group, we can erect vertical lines 
proportional to the frequency in that group. For example, at 1455, 
which is the middle point of the group 1445-1465, we erect 
a vertical line whose length is two units, to represent the 
frequency in that group. At 1475, the height of the vertical 
line is one unit, and so on. If the extremities of these vertical 
lines are joined by straight lines, we get the corresponding 
frequency polygon. With a 20 mm. unit of grouping, the polygon 
is broken and irregular in outline, because many intermediate 
measurements are missing in the sample. 
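
The grouping into 20 mm. classes and the construction of the polygon described above may be sketched in a few lines of Python. This is only an illustration, not the procedure actually used for the paper: the function names are hypothetical, the statures are a handful of values taken from Appendix II rather than the full sample, and the rule that a stature falling exactly on a group boundary is shared equally between the two neighbouring groups is an assumption suggested by the half-frequencies appearing in the table.

```python
def group_frequencies(statures, start=1445, width=20, n_groups=21):
    """Frequency of each group start + i*width to start + (i+1)*width.

    Statures are assumed to be whole numbers of millimetres.
    Assumption: a stature lying exactly on the boundary between two
    groups contributes half a unit to each of them.
    """
    freq = [0.0] * n_groups
    upper = start + width * n_groups
    for s in statures:
        if not start <= s <= upper:
            raise ValueError(f"stature {s} lies outside {start}-{upper} mm.")
        index, remainder = divmod(s - start, width)
        if remainder == 0 and 0 < index < n_groups:
            freq[index - 1] += 0.5  # interior boundary: split between neighbours
            freq[index] += 0.5
        else:
            freq[min(index, n_groups - 1)] += 1.0
    return freq


def polygon_vertices(freq, start=1445, width=20):
    """Vertices (middle point of group, frequency) of the frequency polygon."""
    return [(start + width * i + width / 2, f) for i, f in enumerate(freq)]


# A few statures from Appendix II (not the full sample of 200).
sample = [1446, 1458, 1472, 1500, 1510, 1514, 1705]
freq = group_frequencies(sample)
vertices = polygon_vertices(freq)
for middle, frequency in vertices[:4]:
    print(middle, frequency)
```

Joining successive vertices by straight lines gives the polygon; groups with no measurements are kept with frequency zero, which is what makes the outline broken when the sample is small.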

If we gradually increase the size of our sample, more and 
more of these gaps will be filled up and the polygon will become 
more and more regular. On the other hand, with an indefinitely 
large sample, we can make the size of each group as small as we 
please, without incurring any risk of meeting with gaps in the 
measurements. Thus, with a very large sample, and when the size 
of each group is indefinitely diminished, the discontinuous broken 
polygon will gradually pass into a continuous smooth curve. This 
frequency curve will give us the distribution of stature of an 
indefinitely large population. 

Such distributions are usually termed Chance distributions. 
But as Pearson observes, 1 “in the first place, we have to recognise 
that our conception of chance is now utterly different from that 
of yore. Where we cannot predict, where we do not find order 
and regularity, there we should now assert that something else 
than chance is at work. What we are to understand by a chance 
distribution is one in accordance with law and order, and one the 
nature of which can for all practical purposes be closely 
predicted. It is not theory, but actual statistical experience, 
which forces us to the conclusion that, however little we know of 
what will happen in the individual instance, yet the frequency of 
a large number of instances is distributed round the mode in a 
manner more and more smooth and uniform the greater the number 
of instances. Our conception of chance is one of law 
and order in large numbers; it is not that idea of chaotic incidence 
that vexed the mediaeval mind.” 

The Gaussian distribution (named after the great 
mathematician Gauss) is one important standard type. It has the 
following characteristics:— 

(a) The frequency is maximum for the average value of the 
organ measured. 

(b) The distribution is symmetrical with regard to this 
maximum. 

(c) The curve slopes down, gradually and in a characteristic 
way, to zero, so that extreme degrees of variation become 
increasingly rare. 

(d) The curve ends tangentially to the x-axis, so that infinitely 
large degrees of variation are theoretically possible. 

1 Chances of Death, Vol. 1, p. 11. 

Variability.—We have not yet investigated the question of 
variability of the distribution. Two frequency distributions may be 
both Gaussian and yet their variabilities may differ widely. 
Anthropologists have often used the range, which is defined as the 
difference in size of the most extreme members, as a measure of 
variability. A little reflection will, however, show that the range 
is not at all suitable for this purpose. The inclusion in the sample 
of a single abnormal “dwarf” or “giant” will completely upset 
the value of the range. A measure so radically affected by stray 
items at the extremes is practically useless for scientific purposes. 1 

In current statistical practice it is usual to measure variability 
by the Standard Deviation. The deviation of each measurement 
from the Mean (or Average) is squared. The sum of all such 
squares divided by their total number gives the second moment $\mu_2$, 
which is thus the average squared deviation of all the measurements. 
The square root of $\mu_2$, finally, gives the Standard Deviation. It 
is the root-mean-square deviation of all the measurements, 
and is a precise mathematical measure of the variability of the 
sample. One great advantage in using the Standard Deviation is 
that, together with the Mean, it uniquely defines the corresponding 
Gaussian curve, so that the Gaussian can be found as soon as the 
Standard Deviation is determined. Standard Deviation (or S.D.) is 
usually represented by $\sigma$. 
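
The arithmetic just described can be followed step by step in a short Python sketch. It is offered only as an illustration: the function names are hypothetical, the ten statures are the first ten entries of Appendix II rather than the whole sample, and `gaussian_frequency` assumes that the frequency of a 20 mm. group may be approximated by the ordinate of the Gaussian curve at the middle of the group multiplied by the width of the group.

```python
import math
from statistics import NormalDist, pstdev


def mean(values):
    return sum(values) / len(values)


def second_moment(values):
    """Average of the squared deviations from the Mean."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / len(values)


def standard_deviation(values):
    """Square root of the second moment (the divisor is N, not N - 1)."""
    return math.sqrt(second_moment(values))


def gaussian_frequency(x, total, m, sd, width=20):
    """Approximate frequency given by the Gaussian curve to a group of the
    stated width centred at x, in a sample of `total` individuals."""
    return total * width * NormalDist(m, sd).pdf(x)


# First ten statures (mm.) listed in Appendix II.
sample = [1446, 1624, 1588, 1666, 1726, 1656, 1588, 1588, 1544, 1642]
m = mean(sample)
sd = standard_deviation(sample)
print(round(m, 1), round(sd, 1))
print(round(gaussian_frequency(m, len(sample), m, sd), 2))
```

Once the Mean and the Standard Deviation are known, `gaussian_frequency` can be evaluated at the middle point of every group, which is the sense in which the Gaussian curve is fixed by these two quantities.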

Probable Errors.—The Gaussian distribution is also known 
as the “normal curve of errors,” since it is assumed that this 
curve gives the distribution of “errors” made in physical 
measurements. 2 The greater the diversity in any set of 
measurements the greater will be the Standard Deviation of the set. 
Accuracy or reliability depends on the uniformity of the set of 
measurements, that is, on the smallness of the Standard Deviation. 
The “probable error,” which measures the accuracy or reliability 
of any set of measurements, is hence suitably defined by a 
particular sub-multiple of the Standard Deviation. 

If $\sigma$, the Standard Deviation, is adopted as the unit of 
measurement (that is, all measurements in terms of ordinary units 
are divided by $\sigma$), then the curve of errors becomes the 
standard curve of probability. The mathematical theory of 
probability then enables us to find the probability of any given 
deviation from the Mean occurring in the sample. 

For example, a deviation greater than half the Standard 
Deviation will occur no less than 62 times in 100 samples. A 
deviation greater than the Standard Deviation will occur in 32% 
of instances, while a deviation four times as great will not happen 
more than once in 17,000 instances. The Probable Error is defined 
to be such a deviation as will be exceeded by half the total 
deviations, or in other words, the chances are even that any 
deviation will be greater than or less than the Probable Error. 
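
These probabilities can be checked directly from the Gaussian law. The following Python sketch is an illustration only (the function name is hypothetical); it takes the probability that a deviation exceeds k times the Standard Deviation, in either direction, to be erfc(k/√2), and finds the Probable Error as the deviation exceeded by exactly half the cases, which comes to about 0.6745 of the Standard Deviation. For four times the Standard Deviation the exact figure works out at about once in 15,800, slightly more frequent than the rounded figure quoted above.

```python
import math
from statistics import NormalDist


def prob_exceeding(k):
    """Probability that a Gaussian deviation exceeds k Standard Deviations
    in absolute value (that is, in either direction)."""
    return math.erfc(k / math.sqrt(2))


# The Probable Error as a multiple of the Standard Deviation: the deviation
# which is exceeded (in absolute value) by exactly half of all deviations.
PROBABLE_ERROR_FACTOR = NormalDist().inv_cdf(0.75)

for k in (0.5, 1.0, 4.0, PROBABLE_ERROR_FACTOR):
    print(round(k, 4), prob_exceeding(k))
```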

1 For a simple non-technical account of the different measures of dispersion, 
see King: “Elements of Statistical Theory” (MacMillan, 1919), p. 141. 

2 This assumption itself is not always strictly true. See Pearson’s memoir on 
“Errors of Judgement, etc.” Phil. Trans. Roy. Soc. 198A (1902). 

We must now come back to Anthropology. It is well known 
that almost all anthropometric measurements have an 
approximately Gaussian distribution. This was originally pointed out by 
Quetelet, and since then has been confirmed by many different 
observers. 1 But it must be remembered that the distribution is 
only approximately normal and is almost never exactly so. We are 
thus obliged to study other types of frequency distribution. 

It is often found that the maximum frequency does not occur 
at the Mean value of the character concerned. In such cases, the 
most frequent size, that is, the position of the maximum ordinate, 
is called the Mode. In anthropometric measurements it is very 
usual to find the Mode different from the Mean. When this 
happens, the distribution is no longer symmetrical about the Mean. 
Such asymmetrical distributions are called skew distributions. 2 

The distance between the Mode and the Mean is one obvious 
measure of skewness, or better still (for purposes of comparison), 
this distance divided by the Standard Deviation. The 
mathematical measure of skewness depends on the third moment obtained 
by cubing the deviations from the Mean and taking the average. 
The positive and negative deviations (from the Mean) must, by the 
very definition of the Mean, balance exactly; so that the sum of 
all deviations is zero. For a symmetrical curve this is also true 
of the cubes of deviations. But in the case of an asymmetrical 
curve, the sum of all the cubes of deviations is not zero. Hence 
the third moment, which is merely the average of the cubes 
of all deviations, is not equal to zero. Thus $\mu_3$, or more 
conveniently $\beta_1 = \mu_3^2/\mu_2^3$ (where $\mu_2$ is the second 
moment), is a precise measure of the degree of asymmetry. 
If $\beta_1$ is significantly different from zero, then the curve 
must be considered skew. 
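
As a small numerical illustration of this measure, the sketch below (in Python, with hypothetical function names and two made-up samples chosen only so that the arithmetic can be followed by hand) computes the moments about the Mean and the quantity β1 = μ3²/μ2³. Whether a value of β1 differs significantly from zero is a separate question of probable errors, which the sketch does not attempt.

```python
def central_moment(values, order):
    """Average of the deviations from the Mean raised to the given power."""
    m = sum(values) / len(values)
    return sum((v - m) ** order for v in values) / len(values)


def beta_1(values):
    """Pearson's beta_1: square of the third moment divided by the cube
    of the second moment."""
    return central_moment(values, 3) ** 2 / central_moment(values, 2) ** 3


symmetrical = [1, 2, 3, 4, 5]  # made-up sample, symmetrical about its Mean
skew = [1, 1, 1, 1, 6]         # made-up sample with one long tail

print(central_moment(symmetrical, 1), beta_1(symmetrical))
print(central_moment(skew, 3), beta_1(skew))
```

In the symmetrical sample the cubes of the deviations cancel, so the third moment and β1 both vanish; in the second sample the single large positive deviation is not balanced, and β1 is greater than zero.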

Frequency distributions may differ from the normal curve in 
another particular. The change of slope of the normal curve is 
a characteristic feature of the curve. Now a frequency curve may 
differ from the normal as regards the manner in which its slope 
changes. For example, if a curve rises more abruptly than the 
normal curve, it is then called a lepto-kurtic curve. While if it is 
more flat-topped than the normal, it is called a platy-kurtic curve. 
Curves with the same degree of abruptness as the normal are 
known as meso-kurtic curves. The kurtosis is measured by 
$\beta_2 - 3$, where $\beta_2 = \mu_4/\mu_2^2$ and $\mu_4$ is the 
fourth moment (the average of the fourth powers of the deviations). 
For meso-kurtic curves $\beta_2$ is equal to 3, and the kurtosis is zero. 
For lepto-kurtic curves $\beta_2$ is greater than 3, and for 
platy-kurtic it is less than 3. A frequency curve may also differ 
from the normal in having a definitely limited range. The curve may 
be limited in one or in both directions. With these curves there is 
a definite theoretical limit to the size of deviations. 
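
The measure of kurtosis may be illustrated in the same way. The Python sketch below uses hypothetical function names and two made-up samples; it takes β2 to be Pearson's ratio of the fourth moment to the square of the second, and the `tolerance` argument is merely a stand-in for a proper test of significance, which is not attempted here.

```python
def central_moment(values, order):
    """Average of the deviations from the Mean raised to the given power."""
    m = sum(values) / len(values)
    return sum((v - m) ** order for v in values) / len(values)


def kurtosis(values):
    """beta_2 - 3, where beta_2 is the fourth moment divided by the
    square of the second moment."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3


def classify(values, tolerance=0.0):
    k = kurtosis(values)
    if abs(k) <= tolerance:
        return "meso-kurtic"
    return "lepto-kurtic" if k > 0 else "platy-kurtic"


peaked = [-2, 0, 0, 0, 0, 0, 0, 2]  # made-up: most values crowded at the Mean
flat = [-1, 1, -1, 1]               # made-up: no values at the Mean at all

print(kurtosis(peaked), classify(peaked))
print(kurtosis(flat), classify(flat))
```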

The Coefficient of Variation.—Pearson 3 says, “In dealing with 
the comparative variation of men and women ..., we have 
constantly to bear in mind that relative size influences not only the 
means but the deviations from the means. When dealing 
with absolute measurements, it is, of course, idle to compare the 
variation of the larger male organ directly with the variation of 
the smaller female organ. The same remark also applies to the 
comparison of large and small built races ... We may take 
as a measure of variation the ratio of Standard Deviation to mean, 
or what is more convenient, this quantity multiplied by 100. We 
shall, accordingly, define V, the coefficient of variation, as the 
percentage variation in the mean, the Standard Deviation being 
treated as the total variation in the mean ... Of course, it does 
not follow because we have defined in this manner our “coefficient 
of variation,” that this coefficient is really significant in the 
comparison of various races; it may be only a convenient 
mathematical expression, but I believe there is evidence to show that it 
is a more reliable test of “efficiency” in a race than absolute 
variation ... By “race efficiency,” I would denote stability, 
combined with capacity to play a part in the history of 
civilisation.” 

1 For references see pp. 42-44. 

2 For literature on the subject see references quoted on p. 16. Also J. C. 
Kapteyn: “Skew Frequency Curves in Biology and Statistics.” 

3 Karl Pearson: “Regression, Heredity and Panmixia,” Phil. Trans. 
Roy. Soc., Vol. 187A, 1896, pp. 276-277. 
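
The coefficient of variation defined in the passage quoted above is simply 100 times the Standard Deviation divided by the Mean. The Python sketch below is an illustration with hypothetical function names: the first sample consists of the first ten statures of Appendix II, and the second is an artificial sample, invented for the purpose, in which every stature is made 10 per cent. larger. The absolute variation of the two differs, while V is the same for both, which is the reason for preferring V when groups of different size are compared.

```python
import math


def mean(values):
    return sum(values) / len(values)


def standard_deviation(values):
    """Root-mean-square deviation from the Mean (divisor N)."""
    m = mean(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / len(values))


def coefficient_of_variation(values):
    """V = 100 x Standard Deviation / Mean."""
    return 100 * standard_deviation(values) / mean(values)


# First ten statures (mm.) listed in Appendix II.
sample = [1446, 1624, 1588, 1666, 1726, 1656, 1588, 1588, 1544, 1642]
# Artificial sample (an assumption for illustration): each stature 10% larger.
taller = [1.1 * v for v in sample]

print(round(standard_deviation(sample), 1), round(coefficient_of_variation(sample), 2))
print(round(standard_deviation(taller), 1), round(coefficient_of_variation(taller), 2))
```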



APPENDIX II. 

Table of Measurements. 

Stature of Anglo-Indians measured in the Anthropological Laboratory of 
the Indian Museum, Calcutta. 


Card No.    Age in years    Stature in mm.
87          15              1446
166         16              1624
186         16              1588
147         16              1666
144         16              1726
250         17              1656
175         17              1588
145         17              1588
289         17              1544
76          17              1642
126         17              1610
143         17              1662
44          17              1706
253         18              1746
141         18              1768
132         18              1610
258         18              1602
86          18              1636
191         18              1660
160         18              1570
251         18              1574
94          18              1580
288         19              1638
277         19              1636
294         19              1634
66          19              1630
286         19              1626
298         19              1614
53          19              1604
75          19              1586
288         19              1458
226         19              1768
4           19              1768
151         19              1760
295         19              1744
174         19              1718
299         19              1705
6           19              1706
56          19              1780
146         19              1674
176         19              1686
73          19              1666
246         19              1550
26          19              1646
140         19              1644
91          19              1640
46          20              1716
110         20              1712
142         20              1710
235         20              1700
8           20              1680
175         20              1670
14B         20              1673
168         20              1664
241         20              1638
156         20              1622
280         20              1622
248         20              1562
65          20              1500
275         20              1510
217         20              1514
152         20              1610
67          20              1650
219         20              1620
172         20              1658
107         21              1626
102         21              1636
111         21              1650
287         21              1654
234         21              1656
99          21              1708
133         21              1730
101         21              1768
51          21              1768
106         21              1704
10          21              1694
281         21              1696
28          21              1672
227         21              1678
267         21              1624
88          21              1628
9           22              1730
148         22              1726
74          22              1716
180         22              1700
149         22              1700
108         22              1684
103         22              1688
170         22              1677
72          22              1650
96          22              1568
43          22              1576
177         22              1608
25          22              1644
68          22              1644
7           22              1636
136         22              1636
134         22              1616
243         22              1654
62          22              1658
40          23              1692
11          23              1692
265         23              1680
12B         23              1775
64          23              1472
61          23              1572
269         23              1624
42          23              1646
224         24              1592
45          24              1610
54          24              1620
297         24              1690
50          24              1670
?           24              1684
13          24              1696
230         24              1634
284         24              1596
276         24              1636
47          24              1644
57          25              1738
262         25              1513
54          25              1580
285         25              1619
48          25              1630
3           25              1634
293         26              1644
240         26              1638
282         26              1656
263         26              1730
2           26              1710
60          26              1604
58          26              1628
231         26              1616
63          27              1522
27          27              1700
119         27              1692
38          27              1770
39          27              1776
29          27              1796
137         27              1840
20          27              1656
?           27              1650
31          28              1610
232         28              1636
223         28              1754
32          28              1662
78          28              1662
271         29              1780
268         29              1584
36          29              1730
278         29              1722
247         29              1620
33          29              1608
279         29              1562
55          29              1578
35          30              1656
135         30              1712
22          30              1734
34          30              1760
260         30              1708
216         30              1684
232         30              1640
229         30              1628
77          30              1614
18          30              1672
266         30              1694
37          30              1698
?           31              1716
220         32              1640
97          32              1606
105         32              1592
19          32              1714
201         32              1720
70          32              1734
159         33              1617
264         33              1624
93          33              1788
79          35              1704
155         35              1722
17          35              1670
71          3?              1644
82          39              1610
92          39              1714
252         40              1848
131         41              1638
49          42              1704
52          42              1756
165         43              1540
15          44              1610
5           45              1598
95          48              1574
98                          1554
16                          1586
228                         1632
30                          1654
12                          1670
150                         1690
113                         1694
112                         1700
6B                          1711






